java.lang.Object
com.priint.pubserver.plugin.PluginControlDefault
com.priint.pubserver.comet.bridge.dataprocessing.HtmlFunctions
All Implemented Interfaces:
com.priint.pubserver.plugin.interfaces.PluginControl

public class HtmlFunctions extends com.priint.pubserver.plugin.PluginControlDefault

Configuration

The plug-in can be configured.

HtmlToTaggedText configuration:

The configuration uses a simple string for the XSL transformation that is applied to the internal extended XHTML.

 htmlToTaggedText.xslContent:   XSL as a string.
 htmlToTaggedText.xslArguments: Default arguments for XSL. Can be overridden by the args argument of method "htmlToTaggedText()".
 tidyToXhtml.config:            Default arguments for tidy. Can be overridden by the config argument of method "tidyToXhtml()".
 
Sample Configuration File: (Note: XSL is only a stub)
 <?xml version="1.0" encoding="UTF-8"?>
 <con:PluginConfig xmlns:con="com.priint.pubserver.config.manager/20130620">
  <con:name>config.xml</con:name>
  <con:type>htmlFunctions</con:type>
  <con:custom>
   <htmlFunctionsConfig version="2.0">
    <htmlToTaggedText>
     <xslContent>&lt;?xml version='1.0' encoding='UTF-8'?&gt;&lt;xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'&gt;&lt;!-- insert xsl statements about here --&gt;&lt;/xsl:stylesheet&gt;</xslContent>
     <xslArguments>arg1=value1,arg2=value2</xslArguments>
    </htmlToTaggedText>
    <tidyToXhtml>
     <config>bare=no</config>
    </tidyToXhtml>
   </htmlFunctionsConfig>
  </con:custom>
  <con:dependencies />
  <con:instances />
 </con:PluginConfig>
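
In the sample above, the XSL stylesheet is stored inside xslContent as XML-escaped text. The following is a minimal sketch of how such a string could be prepared before it is pasted into the configuration file. The class XslContentEscaper and its method are hypothetical helpers for illustration only; they are not part of the PubServer API.

```java
// Hypothetical helper (not part of PubServer): escapes an XSL stylesheet so it
// can be embedded as text content of the <xslContent> element.
final class XslContentEscaper {
    private XslContentEscaper() {}

    /** Escapes the characters that are significant in XML text content. */
    static String escapeXml(String raw) {
        // The ampersand must be replaced first, otherwise the entities
        // produced by the later replacements would be escaped again.
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String stub = "<xsl:stylesheet version='1.0'/>";
        // prints: &lt;xsl:stylesheet version='1.0'/&gt;
        System.out.println(escapeXml(stub));
    }
}
```

Single and double quotes do not need escaping in element content, which is why the sample configuration keeps them as they are.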
 
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    MAPPED_NAME
    Use this name to instantiate the plug-in with pubserver.
  • Constructor Summary

    Constructors
    Constructor
    Description
    HtmlFunctions()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    afterCreateConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection createdConfigs)
     
    void
    afterDeleteConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection deletedConfigs)
     
    void
    afterUpdateConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection updatedConfigs)
     
    List<String>
    htmlToTaggedText(List<String> inputList, String configName, String args)
    Converts a list of HTML snippets into InDesign tagged text.
    void
    loadServerConfig()
     
    List<String>
    tidyToXhtml(List<String> inputList, String config)
    Converts a list of HTML snippets into valid XHTML snippets containing only the HTML body content.

    Methods inherited from class com.priint.pubserver.plugin.PluginControlDefault

    createConfiguration, deleteConfigurations, getSession, getSessionId, initInstance, ping, readSessionAttribute, updateConfigurations, writeSessionAttribute

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface com.priint.pubserver.plugin.interfaces.PluginControl

    validateConfigurations
  • Field Details

    • MAPPED_NAME

      public static final String MAPPED_NAME
      Use this name to instantiate the plug-in with pubserver.
  • Constructor Details

    • HtmlFunctions

      public HtmlFunctions()
  • Method Details

    • loadServerConfig

      public void loadServerConfig() throws com.priint.pubserver.plugin.exception.PluginException
      Specified by:
      loadServerConfig in interface com.priint.pubserver.plugin.interfaces.PluginControl
      Overrides:
      loadServerConfig in class com.priint.pubserver.plugin.PluginControlDefault
      Throws:
      com.priint.pubserver.plugin.exception.PluginException
    • afterCreateConfigurations

      public void afterCreateConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection createdConfigs) throws com.priint.pubserver.plugin.exception.PluginException
      Specified by:
      afterCreateConfigurations in interface com.priint.pubserver.plugin.interfaces.PluginControl
      Overrides:
      afterCreateConfigurations in class com.priint.pubserver.plugin.PluginControlDefault
      Throws:
      com.priint.pubserver.plugin.exception.PluginException
    • afterUpdateConfigurations

      public void afterUpdateConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection updatedConfigs) throws com.priint.pubserver.plugin.exception.PluginException
      Specified by:
      afterUpdateConfigurations in interface com.priint.pubserver.plugin.interfaces.PluginControl
      Overrides:
      afterUpdateConfigurations in class com.priint.pubserver.plugin.PluginControlDefault
      Throws:
      com.priint.pubserver.plugin.exception.PluginException
    • afterDeleteConfigurations

      public void afterDeleteConfigurations(String sessionId, com.priint.pubserver.config.PluginConfigCollection deletedConfigs) throws com.priint.pubserver.plugin.exception.PluginException
      Specified by:
      afterDeleteConfigurations in interface com.priint.pubserver.plugin.interfaces.PluginControl
      Overrides:
      afterDeleteConfigurations in class com.priint.pubserver.plugin.PluginControlDefault
      Throws:
      com.priint.pubserver.plugin.exception.PluginException
    • htmlToTaggedText

      public List<String> htmlToTaggedText(List<String> inputList, String configName, String args)
      Converts a list of HTML snippets into InDesign tagged text.

      Supported HTML

      • Not supported structure elements
        • Tags:
          head, button, fieldset, form, input, optgroup, option, textarea, del, menu, applet, area, base, basefont, frame, frameset, iframe, isindex, link, map, meta, noframes, noscript, object, param, script, select, style, title
        • The listed HTML tags are not evaluated and are ignored in the tagged text. This also affects HTML page headers, forms, frames, scripts, styles, embeds, etc.
      • Supported structure elements
        • Tags:
          html, body, abbr, acronym, bdo, big, center, cite, code, dfn, font, ins, kbd, label, legend, q, s, samp, small, span, strike, tt, var
        • Contents (text and subelements) of the listed HTML tags are evaluated and rendered as tagged text. The elements themselves have no effect: no paragraph is inserted and no styles are set.
      • Free text in HTML or BODY
        • Free text is not supported: text paragraphs must always be enclosed in p or div elements.
      • Inline Formatting
        • Style support is as follows:
        • Tags: i, em, b, strong
          • For bold, italic, or bold and italic text.
        • Tags: sub, sup
          • For superscript and subscript text.
        • Tags: u
          • For underlined text.
        • Line breaks and horizontal rules
          • Tags:
            br, hr
          • Both tags produce a hard line break.
          • For hr, no rule (line) is drawn.
        • Hyperlinks
          • Tags: a
          • Hyperlinks are rendered as such. The link target (href) is retained in InDesign for use in interactive PDF documents.
          • Formatting hyperlinks (e.g. bold, italic) is not possible.
        • Paragraphs
          • Tags:
            p, div, caption, address, blockquote, dt, dd, h1, h2, h3, h4, h5, h6, pre
          • Contents of listed elements are rendered as paragraphs in tagged text. If no CSS classes were applied to these elements, the paragraph style in InDesign is called "Paragraph-TagName", e.g. "Paragraph-p" or "Paragraph-address".
        • Listings and numberings
          • Tags:
            ol, ul, li
          • Listings and numberings are supported with the restrictions stated below.
          • li elements are inserted into InDesign as paragraphs with appropriate style names. In HTML, the style can be applied to lists (ol or ul) as well as to list elements (li). This results in "UnorderedList-c1" or "OrderedList-c2" (with <ul class="c1"><li></li></ul> and <ol><li class="c2"></li></ol>).
          • Nested (hierarchical) lists are not supported.
          • Lists always begin with item "1"; the "start" attribute and continued lists are not supported.
          • Lists of the types menu or dl, dt, dd are not supported.
        • Tables are supported.
          • Tags:
            table, thead, tbody, tfoot, tr, th, td, caption, colgroup, col
          • Merged cells are not supported (colspan and rowspan attributes).
          • Nested tables should not be used.
          • The following table settings are dealt with separately:
            • caption
              • treated as leading text paragraph of a table.
            • colgroup, col
              • If the parameter "args" carries values for "table-width" and "colwidth-myclass", the columns of the tagged text table are adjusted to the corresponding widths. "table-width" gives the table width in the units used by tagged text, for example "480" (points). "colwidth-myclass" gives the column width as a fraction of the table width on a scale of 0 to 1.0, e.g. "0.3" for 30% of 480 points. This results in a column width of 144pt.
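
The column width arithmetic described for colgroup and col can be sketched as follows. This is only an illustration of the calculation stated above, not the plug-in's actual implementation. The class ColumnWidthSketch and its methods are hypothetical, and reading "myclass" as a placeholder for a CSS class name is an assumption.

```java
import java.util.Locale;

// Hypothetical illustration (not PubServer code) of how "table-width" and
// "colwidth-<class>" combine into an absolute column width.
final class ColumnWidthSketch {
    private ColumnWidthSketch() {}

    /** Absolute column width = table width multiplied by a fraction in [0, 1]. */
    static double columnWidth(double tableWidth, double fraction) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be between 0 and 1: " + fraction);
        }
        return tableWidth * fraction;
    }

    /**
     * Builds an args string such as "table-width=480,colwidth-myclass=0.3".
     * Treating the key suffix as a CSS class name is an assumption.
     */
    static String tableArgs(int tableWidth, String cssClass, double fraction) {
        return String.format(Locale.ROOT, "table-width=%d,colwidth-%s=%s",
                tableWidth, cssClass, fraction);
    }

    public static void main(String[] args) {
        // 30% of a 480pt table
        System.out.printf(Locale.ROOT, "%.1f%n", columnWidth(480, 0.3)); // prints 144.0
        System.out.println(tableArgs(480, "myclass", 0.3));
    }
}
```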
      Parameters:
      inputList - List of strings containing valid XHTML fragments. Typically list length will be 1.
      configName - Name of a PubServer configuration file containing values for style mappings, XSL transformation, and XSL arguments to override the defaults. Typically the built-in XSL stylesheet and XSL arguments will be enough. Mapping rules from HTML tags/classes to InDesign paragraph and character styles can be added via the config file.
      args - Comma-separated list of string argument pairs for use in the XSL transformation. The key and value of each pair are separated by an equals sign, so a valid args string could be "basehref=D:/Bilder,table-width=480". A comma inside a key or value has to be escaped as "%2C" (URL escaping). Leading and trailing whitespace is trimmed from the strings.

      Supported arguments in default transformation

      • basehref:
        base path for image source resolution
      • tag-style:
        "comet" for WERK II Tagged Text, or an empty string for Adobe InDesign Tagged Text. Defaults to "comet".
      • start-file-tag:
        empty or a custom string. If empty, the default value depends on tag-style: tag style "comet" uses "%TT" as prefix, whereas Adobe InDesign Tagged Text starts with "<ANSI-WIN>".
      • paragraph-placement:
        "prepend" or "append" or "between". Controls whether PARAGRAPH SEPARATOR will be added before, after or only between paragraphs. Defaults to "prepend".
      Returns:
      List of Tagged Text output strings corresponding to the input. Tagged Text follows the conventions of the Comet plug-in for InDesign. Please see the documentation for the Comet desktop plug-ins.
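
The format of the args parameter described above (comma-separated key=value pairs, "%2C" for literal commas, trimmed whitespace) can be sketched as follows. XslArgs is a hypothetical helper written for this illustration; it is not part of the PubServer API, and skipping malformed pairs when decoding is an assumption rather than documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not PubServer code) for building and reading args strings.
final class XslArgs {
    private XslArgs() {}

    /** Builds "key=value,key=value", trimming each part and escaping commas as %2C. */
    static String encode(Map<String, String> args) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> e : args.entrySet()) {
            joiner.add(escape(e.getKey()) + "=" + escape(e.getValue()));
        }
        return joiner.toString();
    }

    /** Splits an args string into trimmed, unescaped key/value pairs. */
    static Map<String, String> decode(String args) {
        Map<String, String> result = new LinkedHashMap<>();
        if (args == null || args.trim().isEmpty()) {
            return result;
        }
        for (String pair : args.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // assumption: pairs without '=' are ignored
            }
            String key = unescape(pair.substring(0, eq).trim());
            String value = unescape(pair.substring(eq + 1).trim());
            result.put(key, value);
        }
        return result;
    }

    private static String escape(String s) {
        return s.trim().replace(",", "%2C");
    }

    private static String unescape(String s) {
        return s.replace("%2C", ",");
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("basehref", "D:/Bilder");
        args.put("table-width", "480");
        // prints: basehref=D:/Bilder,table-width=480
        System.out.println(encode(args));
    }
}
```

The resulting string would then be passed as the third argument of htmlToTaggedText(inputList, configName, args).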
    • tidyToXhtml

      public List<String> tidyToXhtml(List<String> inputList, String config) throws com.priint.pubserver.exception.PubServerException
      Converts a list of HTML snippets into valid XHTML snippets containing only the HTML body content.

      The process can be configured with tidy options as documented at, e.g., http://tidy.sourceforge.net/docs/quickref.html
      Additional config settings for tidy process specific to tidyToXhtml(List, String) are:

      • normalize-whitespace (default: true)
        Replaces every run of whitespace with a single space (similar to the well-known XPath function normalize-space()). If you do not need this behavior, add "normalize-whitespace=false" to the config string.
      • trim-whitespace-after-pi (default: true)
        Trims one space after a processing instruction if and only if the processing instruction is followed by an empty embed tag. This feature can be used to work around a shortcoming in JTidy processing, which adds extra whitespace after a PI. If you do not need this behavior, add "trim-whitespace-after-pi=false" to the config string.
      • preserve-empty-paragraphs (default: true)
        By default, JTidy converts empty <P></P> into <br /><br />. tidyToXhtml preserves the Ps, in accordance with Dave Raggett's original HTML Tidy. If you need the JTidy behavior, add "preserve-empty-paragraphs=false" to the config string. This is not the same as the tidy option "drop-empty-paras", which is set to false by default.
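
The effect of normalize-whitespace can be approximated with a plain regular expression, as in the sketch below. This only illustrates the documented behavior (runs of whitespace become a single space); it is not the plug-in's implementation, and the class WhitespaceSketch is hypothetical.

```java
// Hypothetical illustration (not PubServer code) of what "normalize-whitespace"
// is documented to do: collapse every run of whitespace into one space.
final class WhitespaceSketch {
    private WhitespaceSketch() {}

    /** Replaces each run of spaces, tabs, and line breaks with a single space. */
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // prints: <p>Hello world</p>
        System.out.println(collapseWhitespace("<p>Hello \n\t world</p>"));
    }
}
```

To switch the feature off for a call, the config string passed to tidyToXhtml would contain "normalize-whitespace=false", optionally combined with other pairs such as "preserve-empty-paragraphs=false".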
      Parameters:
      inputList - List of HTML string snippets to be converted into XHTML
      config - Additional settings for the tidy process. A comma-separated list of key=value pairs overriding the default properties of the tidy process.
      E.g. assume-xml-procins=no,quote-marks=yes
      Be careful when using this feature, as it may break XML parsing or the result.
      Returns:
      List of valid XHTML body content fragments
      Throws:
      com.priint.pubserver.exception.PubServerException - thrown if an unknown internal exception occurs during HTML parsing and cleaning.