This filter allows you to process HTML5 documents. Those documents may contain ITS 2.0 markup.
The input documents are expected to be valid HTML5.
The filter decides which encoding to use for the input document using the detection mechanism defined by the HTML5 specification.
The output encoding is the same as the input encoding, except if defined otherwise by the calling tool.
The output uses the same type of line breaks as the original input.
By default, the filter processes HTML5 documents based on the ITS defaults. That is:
- The lang attribute is used as the local markup for the Language Information data category.
- The id attribute is used as the local markup for the Id Value data category.
- Most of the phrasing content elements are interpreted as withinText="yes" for the Element Within Text data category.
- The translate attribute is used as the local markup for the Translate data category, and the behavior for that data category is different from the one in XML. See the HTML5 definition for details.
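To illustrate how the HTML5 translate attribute differs from a plain XML attribute, here is a minimal Python sketch of its inheritance rule: "yes" or an empty value turns translation on, "no" turns it off, and any other case inherits the state of the parent element. This is not the filter's implementation. The class name TranslateScanner and the function scan are hypothetical, and the sketch assumes every non-void element has an explicit end tag.

```python
from html.parser import HTMLParser

# Void elements have no end tag, so they must never be pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TranslateScanner(HTMLParser):
    """Hypothetical helper: collects (text, translatable) pairs."""

    def __init__(self):
        super().__init__()
        self.stack = [True]   # the document default is "translatable"
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attributes = dict(attrs)
        state = self.stack[-1]          # inherit from the parent by default
        if "translate" in attributes:
            # A valueless attribute is reported as None; treat it as "".
            value = (attributes["translate"] or "").lower()
            if value in ("yes", ""):
                state = True
            elif value == "no":
                state = False
            # Any other value is invalid and keeps the inherited state.
        self.stack.append(state)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.results.append((text, self.stack[-1]))


def scan(html):
    """Return the text runs of an HTML fragment with their translate state."""
    scanner = TranslateScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.results


if __name__ == "__main__":
    sample = '<p>Click <code translate="no">run()</code> to <b>start</b>.</p>'
    for text, translatable in scan(sample):
        print(translatable, text)
```

In the sample, only the content of the code element is reported as not translatable; the b element carries no translate attribute and therefore inherits the state of its parent p element.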
Default behavior can be overridden when the input document contains ITS markup, or if a filter parameters file is specified. The parameters file used by the filter is an ITS document.
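Because the parameters file is an ITS document, a small one might look like the XML embedded in the Python sketch below. The two rules and their selectors are illustrative assumptions, not defaults of the filter. The sketch also assumes, as in the HTML5 examples of the ITS 2.0 specification, that HTML elements are selected in the XHTML namespace through a prefix such as h. The function list_rules is a hypothetical helper that only checks that the document is well-formed and lists the rules it contains.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Illustrative ITS 2.0 rules document (assumed content, not a filter default).
PARAMETERS = """<its:rules xmlns:its="http://www.w3.org/2005/11/its"
           xmlns:h="http://www.w3.org/1999/xhtml"
           version="2.0">
  <!-- Do not translate the content of code elements. -->
  <its:translateRule selector="//h:code" translate="no"/>
  <!-- Treat dfn elements as terms. -->
  <its:termRule selector="//h:dfn" term="yes"/>
</its:rules>
"""


def list_rules(xml_text):
    """Parse an ITS rules document and return (rule name, selector) pairs."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}rules" % ITS_NS:
        raise ValueError("The root element is not its:rules")
    rules = []
    for rule in root:
        # ElementTree reports tags as "{namespace}localname".
        name = rule.tag.split("}", 1)[1]
        rules.append((name, rule.get("selector")))
    return rules


if __name__ == "__main__":
    for name, selector in list_rules(PARAMETERS):
        print(name, selector)
```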
The Internationalization Tag Set (ITS) is a W3C specification that defines a set of elements and attributes you can use to specify different internationalization- and localization-related aspects of your XML and HTML5 documents. For instance, ITS allows you to define which element should be treated as a nested sub-flow of text, which element denotes a term, how to identify the language of the content, and much more.
The filter supports ITS 2.0.
- The ITS 2.0 specification is available at http://www.w3.org/TR/its20/.
See the "ITS" page for more details on the format.
The filter supports global and local rules and most data categories. See the ITS Components page for a detailed list of how the data categories are supported and other information on the implementation.
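As a rough illustration of the difference between global and local rules in HTML5, the Python sketch below scans a document for both kinds of markup. In ITS 2.0, local markup in HTML5 uses attributes with the its- prefix (for example its-term or its-loc-note), while global rules are either linked with a link element whose rel value is its-rules or embedded in a script element of type application/its+xml. The class ItsMarkupScanner, the function find_its_markup, and the file name rules.xml are hypothetical, and the sketch says nothing about how the filter itself resolves rules.

```python
from html.parser import HTMLParser

SAMPLE = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Sample</title>
<link href="rules.xml" rel="its-rules">
</head>
<body>
<p>The <span its-term="yes">filter</span> reads
<span its-loc-note="File format name">HTML5</span> files.</p>
</body>
</html>
"""


class ItsMarkupScanner(HTMLParser):
    """Hypothetical helper: separates global rule references from local markup."""

    def __init__(self):
        super().__init__()
        self.global_links = []   # href values of linked rules documents
        self.inline_rules = 0    # number of embedded rules scripts
        self.local = []          # (element, attribute, value) triples

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "its-rules" in rel_tokens:
            self.global_links.append(attributes.get("href"))
        script_type = (attributes.get("type") or "").lower()
        if tag == "script" and script_type == "application/its+xml":
            self.inline_rules += 1
        for name, value in attrs:
            if name.startswith("its-"):
                self.local.append((tag, name, value))


def find_its_markup(html):
    """Return the scanner after it has processed the given HTML text."""
    scanner = ItsMarkupScanner()
    scanner.feed(html)
    scanner.close()
    return scanner


if __name__ == "__main__":
    found = find_its_markup(SAMPLE)
    print("Linked rules:", found.global_links)
    print("Local markup:", found.local)
```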
- This filter is BETA.