From version 4.0.5 onward, Comet 4 can export text frames as HTML documents. Style information is written to a CSS file. The goal is to reproduce the InDesign® contents in the HTML document as exactly as possible.
Since Comet 4.1 R20505, HTML-formatted text can also be imported into InDesign® documents.
The Comet Plugins export the formatted text of a frame. Style information of the document is collected and stored in a separate CSS file in the "resources" subfolder of the target folder. The CSS style information results from paragraph formats, character formats, table formats and cell formats. Basically all supported styles of the document are exported.
Please note: The HTML export does not generate HTML pages from the InDesign® document pages.
Since HTML has a text structure similar to that of InDesign®, the text structure is translated as follows:
Local style changes are written directly to the style attribute of the respective element.
Attribute | CSS | Info |
Font family | font-family | |
Font face | font-weight, font-style, font-stretch | In HTML/CSS a font face can hardly be specified directly; it is expressed through weight, style and stretch. The values for these attributes are calculated using fontDB. Fonts not defined in fontDB: If the font is not described in fontDB, the attributes are derived from the font name. Font names follow no fixed rules, so deriving the font attributes from the name is imprecise. You should therefore make sure that all fonts used are described in fontDB. The following parts of the font name are supported: |
Line spacing | line-height | |
Letter type | text-transform | Only 'capital' - text-transform:uppercase |
Position | vertical-align | |
Underline | text-decoration | text-decoration: underline |
Line-through | text-decoration | text-decoration: line-through |
Vertical align | vertical-align | Not compatible with "Position" |
Alignment | text-align | |
List type: Bullet | Only normal CSS/HTML bullet characters for <ul> are possible. | |
List type: Numbered | list-style-type | |
List type: Numbered, mode | start | "Continue numbering" and "Begin with" are both supported |
Character color | color: rgb(r, g, b) or color: #FF0000 | Color names (Swatches) are not supported |
Character styles are exported as <span> elements in HTML. The supported parameters are the same as for paragraph styles.
Tables are exported as <table>-elements in HTML. They support header and footer rows, merged cells and the following table format attributes:
Attribute | CSS | Info |
Cell format | All settings are supported | |
Table contour, strength | border-left-width, border-right-width, border-top-width, border-bottom-width | |
Table contour, Color | border-left-color, border-right-color, border-top-color, border-bottom-color | |
Table contour, Style | border-left-style, border-right-style, border-top-style, border-bottom-style | |
Fill, alternating pattern | table.Tablename tr or td:nth-child(Blocksize + Skip first row/column):nth-last-child(n + Skip last row/column), fill color | Translating this attribute to CSS requires two style definitions, one per fill color. CSS cannot express a block size directly, so each index within the block needs its own selector. |
Example: the first two rows are cyan, the next three rows are magenta (block size = 5), the first three rows and the last four rows are skipped:
table.priint_KTabelle tr:nth-child(5n + 4):nth-last-child(n + 5), tr:nth-child(5n + 5):nth-last-child(n + 5) { background-color: rgba(0, 158, 227, 1.00); }
table.priint_KTabelle tr:nth-child(5n + 6):nth-last-child(n + 5), tr:nth-child(5n + 7):nth-last-child(n + 5), tr:nth-child(5n + 8):nth-last-child(n + 5) { background-color: rgba(229, 0, 125, 1.00); }
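The selector arithmetic of the alternating fill example can be sketched in Python. This is only an illustration of the pattern described above, not the exporter's code: the function name and its parameters are hypothetical, and unlike the example above the sketch prefixes every selector with the table class.

```python
def alternating_fill_selectors(table_class, first_count, second_count,
                               skip_first=0, skip_last=0, element="tr"):
    """Return the selector lists for the first and the second fill color."""
    block = first_count + second_count
    # :nth-last-child(n + k) only matches elements that have at least
    # k - 1 siblings after them, which skips the last k - 1 rows.
    tail = f":nth-last-child(n + {skip_last + 1})"

    def group(offsets):
        # One selector per index inside the block, shifted past skipped rows.
        return ", ".join(
            f"table.{table_class} {element}"
            f":nth-child({block}n + {skip_first + k}){tail}"
            for k in offsets
        )

    first = group(range(1, first_count + 1))
    second = group(range(first_count + 1, block + 1))
    return first, second


first, second = alternating_fill_selectors(
    "priint_KTabelle", 2, 3, skip_first=3, skip_last=4)
print(first + " { background-color: rgba(0, 158, 227, 1.00); }")
print(second + " { background-color: rgba(229, 0, 125, 1.00); }")
```

With two cyan rows, three magenta rows, three skipped leading rows and four skipped trailing rows, the offsets come out as 5n + 4 and 5n + 5 for the first color and 5n + 6 to 5n + 8 for the second.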
Table cells are exported as <td>-elements. The following format options are supported:
Attribute | CSS | Info |
Paragraph style | Set the paragraph style of the content | |
Text rotation | transform: rotate(%ddeg) | Applies to the <p> element inside the cell, otherwise the cell will rotate. |
Cell contour, thickness | border-left-width, border-right-width, border-top-width, border-bottom-width | |
Cell contour, type | border-left-style, border-right-style, border-top-style, border-bottom-style | |
Cell contour, Color | border-left-color, border-right-color, border-top-color, border-bottom-color | |
Cell surface, Color | background-color: rgb(r, g, b) or background-color: #00FF00 | |
Cell offset | padding-left, padding-top, padding-right, padding-bottom | |
Some InDesign® control characters are either not available in HTML or have a different meaning there. These characters are exported as follows:
<?ACE HEXCODE ?>, e.g. <?ACE 8 ?> for the right-aligned TAB.
The following characters are treated like this:
Hexcode | Normal meaning | InDesign® Meaning |
0003 | End of text | Exit nested format here |
0004 | End of transmission | Footnote |
0007 | Bell | Indent to here |
0008 | Backspace | Tabulator for right orientation |
0016 | Synchronous idle | Table anchor |
0017 | End of transmission block | Table continuation |
0018 | Cancel | Page number |
0019 | End of medium | Paragraph name |
001A | Substitute | "non roman special glyph" |
On import, these characters are converted back.
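The round trip between control characters and the <?ACE ...?> pseudo tag can be illustrated with a short Python sketch. It is not the plugin's implementation. It assumes that the hex code is written in uppercase without leading zeros, as in <?ACE 8 ?>, and that exactly the characters listed in the table are affected; the function names are hypothetical.

```python
import re

# Control characters from the table above (hex code points).
ACE_CODES = {0x03, 0x04, 0x07, 0x08, 0x16, 0x17, 0x18, 0x19, 0x1A}

# Matches a pseudo tag such as "<?ACE 8 ?>" and captures the hex code.
ACE_PATTERN = re.compile(r"<\?ACE\s+([0-9A-Fa-f]+)\s*\?>")


def encode_ace(text):
    """Replace each listed control character by its <?ACE ...?> pseudo tag."""
    return "".join(
        f"<?ACE {ord(ch):X} ?>" if ord(ch) in ACE_CODES else ch
        for ch in text
    )


def decode_ace(html):
    """Convert <?ACE ...?> pseudo tags back into control characters."""
    return ACE_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), html)


exported = encode_ace("Price\x0842")
print(exported)
print(decode_ace(exported) == "Price\x0842")
```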
To define TaggedText directly in HTML, use the pseudo tag
<?IDTT ?>
The content of this tag is included in the result directly, without further conversion. Whitespace between the text and the ? is ignored.
Inline frames are exported as separate HTML documents and linked via the iframe element. The filename is the UID of the frame + ".html". Inline text frames and picture frames are supported. In the case of an image frame, there is an option to link the image only or copy it to the "resources" subfolder of the destination folder. If a linked image no longer exists, a PNG of the image will be exported and also placed in the "resources" subfolder. The image name corresponds to the original name of the image, with the extension ".png" if the original image was not a PNG.
InDesign® distinguishes between two types of style hierarchies: the first determines which style a style is based on, the second determines which style folder it belongs to. Both types are taken into account in the HTML export.
To maintain the style folder hierarchy, styles are noted in the CSS as follows:
p.FormatGroup1.ParagraphFormat2 { }
They are applied in the document as follows:
<p class="FormatGroup1 ParagraphFormat2"></p>
This way you can track which style was in which style folder.
CSS offers no direct way to base one style on another. Comet solves this by adding the selectors of sub-styles to the CSS rules of their parent styles. Each style contains only the properties that differ from its parent style. In the document, only the lowest style of the hierarchy chain is then applied.
In the following example, "paragraphFormat1" changes the typeface; "paragraphFormat2" is based on "paragraphFormat1" and changes the font size.
The CSS notation is as follows:
p.paragraphFormat1, p.paragraphFormat2 { font-family: 'Calibri'; }
p.paragraphFormat2 { font-size: 14pt; }
It is applied in the document as follows:
<p class="paragraphFormat2"></p>
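The way "based on" relationships turn into grouped selectors can be sketched in Python. This is a minimal illustration under stated assumptions, not Comet's implementation: the function name and the input structure (style name mapped to parent name and the properties that differ from the parent) are hypothetical, and parents are assumed to be listed before their children so that child rules come later in the CSS.

```python
def based_on_css(styles, element="p"):
    """Build CSS rules in which each style also selects its sub-styles.

    styles maps a style name to (parent name or None, properties that
    differ from the parent).
    """
    def descendants(name):
        # Collect all styles that are directly or indirectly based on name.
        found = []
        for child, (parent, _) in styles.items():
            if parent == name:
                found.append(child)
                found.extend(descendants(child))
        return found

    rules = []
    for name, (_, props) in styles.items():
        names = [name] + descendants(name)
        selectors = ", ".join(f"{element}.{n}" for n in names)
        body = " ".join(f"{key}: {value};" for key, value in props.items())
        rules.append(f"{selectors} {{ {body} }}")
    return rules


styles = {
    "paragraphFormat1": (None, {"font-family": "'Calibri'"}),
    "paragraphFormat2": ("paragraphFormat1", {"font-size": "14pt"}),
}
rules = based_on_css(styles)
for rule in rules:
    print(rule)
```

Because a parent's rule also selects every style based on it, an element only needs the class of the lowest style in the chain.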
InDesign® places hardly any restrictions on style names. HTML and CSS are more restrictive. In CSS, the following characters have a special meaning:
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ ` { | } ~
In InDesign® inheritance is already built into the definition of styles, and spaces are (normal) part of the style name. If you want to use styles with spaces in the HTML code, the spaces must be encoded accordingly, e.g. with %20. More info about the encodings can be found below.
Furthermore CSS names must not start with a number or with a hyphen followed by a number. For this reason, each style name gets the prefix "priint_".
As of Comet 4.1 R21800, there are two different ways to export style names. They are selected with the parameter "kCSSEscapeMode" of the cScript function html::export_.
The following modes are available:
Hex mode:
In this mode, each of the characters listed above is replaced by a hex escape sequence: 0x + hex code of the character, for example 0x002B for the plus sign. Avoid such sequences in your original style names, because the reverse translation would no longer be unambiguous. The advantage of this mode is that style names in CSS and HTML are exactly the same.
Examples:
Style name | CSS & HTML |
Hello World | priint_Hello0x0020World |
x> and %20 | priint_x0x003E0x0020and0x00200x002520 |
AAA% %BBB | priint_AAA0x00250x00200x0025BBB |
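A minimal Python sketch of the hex mode, assuming that exactly the CSS special characters listed earlier plus the space are escaped and that the code is written as four uppercase hex digits. The function name is hypothetical and the sketch is not the exporter's code.

```python
# CSS special characters from the list above, plus the space.
SPECIAL = set("!\"#$%&'()*+,-./:;<=>?@[\\]^`{|}~ ")


def hex_mode_name(style_name, prefix="priint_"):
    """Escape a style name as in hex mode: 0x + four-digit hex code."""
    return prefix + "".join(
        f"0x{ord(ch):04X}" if ch in SPECIAL else ch
        for ch in style_name
    )


print(hex_mode_name("AAA% %BBB"))
```

For the style name AAA% %BBB this yields priint_AAA0x00250x00200x0025BBB, the same string in CSS and in HTML.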
Slash mode:
This mode places a backslash in front of each unsupported character in the CSS definition, but not in the HTML text application. This makes the style names easier to read, but CSS definition and HTML application are different, which makes editing by e.g. text substitution more difficult. In addition, some characters are handled separately:
Character | CSS | HTML |
Space | \%20 | %20 |
% | \%25 | %25 |
< | \< | < |
> | \> | > |
" | \" | " |
& | \& | & |
Examples:
Style Name | CSS | HTML |
Hello World | priint_Hello\%20World | priint_Hello%20World |
x> and %20 | priint_x\>\%20and\%20\%2520 | priint_x>%20and%20%2520 |
AAA% %BBB | priint_AAA\%25\%20\%25BBB | priint_AAA%25%20%25BBB |
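The slash mode can be sketched in the same way. The sketch follows the character table as printed here: space and % become %20 and %25 on both sides, with a leading backslash in CSS only, and all other special characters get a backslash in CSS and stay unchanged in HTML. Whether the real export additionally entity-encodes <, >, " and & in the HTML attribute is not covered; the function name is hypothetical.

```python
# CSS special characters from the list above.
SPECIAL = set("!\"#$%&'()*+,-./:;<=>?@[\\]^`{|}~")

# Characters that are percent-encoded in both the CSS and the HTML form.
PERCENT_ENCODED = {" ": "%20", "%": "%25"}


def slash_mode_names(style_name, prefix="priint_"):
    """Return (css_name, html_name) for a style name in slash mode."""
    css, html = [], []
    for ch in style_name:
        if ch in PERCENT_ENCODED:
            css.append("\\" + PERCENT_ENCODED[ch])
            html.append(PERCENT_ENCODED[ch])
        elif ch in SPECIAL:
            css.append("\\" + ch)   # backslash only in the CSS definition
            html.append(ch)
        else:
            css.append(ch)
            html.append(ch)
    return prefix + "".join(css), prefix + "".join(html)


css_name, html_name = slash_mode_names("AAA% %BBB")
print(css_name)
print(html_name)
```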
The HTML import is currently under development and is not part of the support of Werk!
Please note that only a limited set of HTML and CSS features can be supported.
The following tags are supported:
Tag | Meaning | Info |
<p> | Paragraph | |
<span> | Character style or local change of supported text attributes | |
<table> | Table | Either: Only <tr> subnodes. Or: <thead>, <tbody>, and <tfoot> with <tr> subnodes. |
<thead> | Table header | Directly under <table>, in conjunction with <tbody> and (optional) <tfoot> |
<tbody> | Table body | Directly under <table>, in conjunction with (optional) <thead> and (optional) <tfoot> |
<tfoot> | Table foot | Directly under <table>, in conjunction with <tbody> and (optional) <thead> |
<tr> | Table row | |
<td> | Table cell | Supporting node for tables - is added underneath <tr>. |
<ul> | Bulleted list | |
<ol> | Numbered list | |
<li> | List element | Supporting node for lists - is added to <ol> or <ul>. |
<?ACE ...?> | InDesign® Control character | Tag for InDesign® control character inserts. See here. |
<i> <em> | Italic | fontDB is used to calculate the required font face of the current font family. If no font family is specified in the HTML text, the font family at the insert position in InDesign® is used. |
<b> <strong> | Bold | |
<s> <del> <strike> | Line through | Overridden by CSS attribute text-decoration: line-through. See here. |
<u> | Underline | Overridden by CSS attribute text-decoration: underline. See here. |
<br> | Soft return | \n |
<sup> | Super script | |
<sub> | Sub script | |
<image> | Image |
Other tags like <div> or comments (<!-- ... -->) are ignored.
Currently, the CSS style definitions written by the export are not imported!
As in the HTML export, the styles of an HTML node are determined by its "class" attribute. The "class" attribute of the following nodes corresponds to the following style types:
<p>, <li> | Paragraph style |
<span> | Character style |
<table> | Table style |
<td> | Cell style |
"priint_"-Prefixes in style names are removed. Characters reserved in HTML that have been replaced by an escape sequence are unescaped.
The import of CSS attributes is currently only supported as a style attribute of an HTML node (e.g. <span style="font-size:20pt">).
The following applies: The "lowest" attribute always has priority, i.e. the attribute that is closest to the content in the hierarchy. For example, the text attribute "font-size" will be overwritten on a <p> node if a <span> node sets the same attribute below it.
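The "lowest attribute wins" rule can be illustrated with a small Python sketch that merges the style attributes along a node chain from the outermost to the innermost node. It is a simplified stand-in for the importer's behavior, and both function names are hypothetical.

```python
def parse_style(style_attr):
    """Split a style attribute such as "font-size:20pt; color:#0000FF"."""
    props = {}
    for declaration in style_attr.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip()
    return props


def effective_style(style_attrs_outer_to_inner):
    """Merge style attributes so that the innermost node has priority."""
    merged = {}
    for attr in style_attrs_outer_to_inner:
        merged.update(parse_style(attr))  # inner values overwrite outer ones
    return merged


# <p style="font-size:12pt; color:#0000FF"><span style="font-size:20pt">...
result = effective_style(["font-size:12pt; color:#0000FF", "font-size:20pt"])
print(result)
```

Here the text inside the <span> keeps the color set on the <p> node but takes its font size from the <span>.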
The following attributes are supported:
CSS Attribute | Supported values / units | Info |
font-family | Only family-name values, not generic-family. See here | All values must be enclosed in quotation marks! e.g. style="font-family: 'Minion Pro' " More info here. |
font-size | int or float, e.g. font-size: 24pt; | Font size in points. [since v4.1 R23700] The following units are allowed: pt. Relative sizes are not supported! |
font-weight | 100, 200, ..., 900 (400 corresponds to normal) | fontDB is used to calculate the required font face of the current font family. If no font family is specified in the HTML text, the font family at the insert position in InDesign® is used. |
font-style | italic, oblique | |
font-stretch | ultra-condensed, extra-condensed, condensed, semi-condensed, normal, semi-expanded, expanded, extra-expanded, ultra-expanded | |
color | rgb(int, int, int), #0000FF, MyColor, 'My Color' | Font color. Named document swatches are supported since v4.3 R34050. |
text-decoration | underline, line-through | |
CSS Attribute | Supported values / units | Info |
text-align | left, right, center, justify | Text alignment. For the value justify the tag <pTextAlignment:JustifyFull> is used. Other justification variants are not supported. |
CSS Attribute | Supported values / units | Info |
width, height | int or float | Column width in points. If the cells of a column specify contradictory widths, the largest is used. [since v4.1 R23700] The following units are allowed: pt. Relative sizes and direct attributes like <td width="60pt">Title</td> are not supported! Example: <td style="width:60pt">Title</td> |
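The rule that the largest width in a column wins can be sketched as follows. This is only an illustration: the function names are hypothetical, merged cells are ignored, and any value that is not a plain pt size (for example a relative size) is treated as "no width given".

```python
import re

# Accepts values such as "60pt" or "12.5pt"; everything else is rejected.
PT_VALUE = re.compile(r"^\s*(\d+(?:\.\d+)?)pt\s*$")


def parse_pt(value):
    """Return the size in points, or None for unsupported values."""
    match = PT_VALUE.match(value)
    return float(match.group(1)) if match else None


def column_widths(rows):
    """rows holds, per table row, the width string (or None) of each cell."""
    widths = []
    for row in rows:
        for col, value in enumerate(row):
            if col >= len(widths):
                widths.append(None)
            pt = parse_pt(value) if value else None
            # Keep the largest width seen so far in this column.
            if pt is not None and (widths[col] is None or pt > widths[col]):
                widths[col] = pt
    return widths


widths = column_widths([["60pt", None], ["80pt", "50%"]])
print(widths)
```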
HTML entities in decimal (e.g. &#160;) and hexadecimal notation (e.g. &#xA0;) are supported and replaced by the corresponding characters (e.g. for html::to_tagged or direct import into Illustrator). Additionally, keywords for HTML entities (e.g. &nbsp;) are supported. You can find the complete list at https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
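The three entity notations can be tried out with Python's standard library, used here only as a stand-in for the replacement step; it says nothing about how the plugin resolves entities internally.

```python
import html

# Decimal, hexadecimal and keyword notation of the non-breaking space.
decimal_form = html.unescape("&#160;")
hex_form = html.unescape("&#xA0;")
keyword_form = html.unescape("&nbsp;")

# All three notations resolve to the same character, U+00A0.
print(decimal_form == hex_form == keyword_form == "\u00a0")
print(html.unescape("5&nbsp;&euro;"))
```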
There are cScript functions available for importing and exporting HTML:
Export:
Import:
Misc.:
Simple export of a text frame:
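int main()
{
    String folder = string::alloc();
    String docName = string::alloc("HTML Export");
    int err = 0;

    // Let the user choose the target folder; stop if this returns an error.
    err = file::select_folder(folder);
    if (err) { return 0; }

    // Fetch the document name; it is used as the output name.
    document::name(docName);

    // Export the current frame (gFrame) with CSS escape mode 0.
    html::export_frame(
        gFrame,
        "kOutputFolder", folder,
        "kOutputName", docName,
        "kDocTitle", "Hello World",
        "kCSSEscapeMode", 0);

    return 0;
}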