Teo Choong Ping

Reputation: 12798

How to convert HTML to RTF in Java?

(I'm looking for an open source library.)

Upvotes: 1

Views: 11322

Answers (2)

Niraj Chapla

Reputation: 2179

You can convert HTML to RTF with the standard Swing classes RTFEditorKit and HTMLEditorKit. On their own they do not convert line-break tags such as <br/> and <p> into the equivalent RTF line break, so I apply a workaround, shown in the following Java code.

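// Imports needed by the method below.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

private static String convertToRTF(String htmlStr) {
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    HTMLEditorKit htmlEditorKit = new HTMLEditorKit();
    RTFEditorKit rtfEditorKit = new RTFEditorKit();
    String rtfStr = null;

    // Mark line breaks with a placeholder before parsing the HTML.
    htmlStr = htmlStr.replaceAll("<br.*?>", "#NEW_LINE#");
    htmlStr = htmlStr.replaceAll("</p>", "#NEW_LINE#");
    // Strip opening <p ...> tags only; "<p.*?>" would also match <pre>, <param>, etc.
    htmlStr = htmlStr.replaceAll("<p(\\s[^>]*)?>", "");
    try {
        Document doc = htmlEditorKit.createDefaultDocument();
        // Read from a Reader so the platform default charset is not involved.
        htmlEditorKit.read(new StringReader(htmlStr), doc, 0);
        rtfEditorKit.write(os, doc, 0, doc.getLength());
        rtfStr = os.toString();
        // Turn each placeholder into the RTF paragraph control word \par.
        rtfStr = rtfStr.replaceAll("#NEW_LINE#", "\\\\par ");
    } catch (IOException e) {
        e.printStackTrace();
    } catch (BadLocationException e) {
        e.printStackTrace();
    }
    // null if the conversion failed
    return rtfStr;
}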

Here, I replace the HTML tags that mark a line break with a placeholder string before the conversion, and afterwards replace that placeholder with \par, the RTF control word for a new paragraph.
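To see the placeholder trick in isolation, here is a minimal sketch that leaves out the Swing conversion and only does the two string passes. The class name LineBreakPlaceholders and the method names markLineBreaks and restoreLineBreaks are made up for this illustration, and the patterns are slightly stricter than the ones above (case-insensitive, and they only match real <br> and <p> tags), so treat it as one possible variant rather than a drop-in replacement.

```java
final class LineBreakPlaceholders {
    // Hypothetical marker; any token that cannot occur in the input would do.
    static final String MARKER = "#NEW_LINE#";

    // Pass 1 (before HTML parsing): replace <br> and </p> with the marker
    // and drop opening <p ...> tags.
    static String markLineBreaks(String html) {
        return html
                .replaceAll("(?i)<br\\s*/?>", MARKER)
                .replaceAll("(?i)</p\\s*>", MARKER)
                .replaceAll("(?i)<p(\\s[^>]*)?>", "");
    }

    // Pass 2 (after RTF generation): replace each marker with the RTF \par control word.
    static String restoreLineBreaks(String rtf) {
        // String.replace works on literal text, so the backslash needs no regex escaping.
        return rtf.replace(MARKER, "\\par ");
    }

    public static void main(String[] args) {
        String marked = markLineBreaks("<p class=\"x\">one<br/>two</p>");
        System.out.println(marked);
        System.out.println(restoreLineBreaks(marked));
    }
}
```

In the full method, the HTMLEditorKit read and RTFEditorKit write happen between the two passes; the marker survives because it is plain text to both kits.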

If you want a more capable API and your HTML is valid, you should explore Apache FOP.

Apache FOP can produce RTF output. Its input is XSL-FO, so the HTML has to be transformed to XSL-FO first. Some useful links:

http://www.torsten-horn.de/techdocs/java-xsl.htm#XSL-FO-Java

http://html2fo.sourceforge.net/index.html

Upvotes: 4
