Reputation: 12798
(I'm looking for an open source library)
Upvotes: 1
Views: 11322
Reputation: 2179
You can convert HTML to RTF using the standard Java (Swing) APIs RTFEditorKit and HTMLEditorKit.
On its own, this does not convert line-break tags such as <br/> and <p> into their RTF equivalent. I worked around that with an external fix, shown in the following Java code.
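import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

private static String convertToRTF(String htmlStr) {
    OutputStream os = new ByteArrayOutputStream();
    HTMLEditorKit htmlEditorKit = new HTMLEditorKit();
    RTFEditorKit rtfEditorKit = new RTFEditorKit();
    String rtfStr = null;
    // Swap line-breaking tags for a marker that survives the HTML parse.
    htmlStr = htmlStr.replaceAll("<br.*?>", "#NEW_LINE#");
    htmlStr = htmlStr.replaceAll("</p>", "#NEW_LINE#");
    // Strip opening <p ...> tags without also matching <pre> or <param>.
    htmlStr = htmlStr.replaceAll("<p(\\s[^>]*)?>", "");
    InputStream is = new ByteArrayInputStream(htmlStr.getBytes());
    try {
        Document doc = htmlEditorKit.createDefaultDocument();
        htmlEditorKit.read(is, doc, 0);
        rtfEditorKit.write(os, doc, 0, doc.getLength());
        rtfStr = os.toString();
        // "\\\\par " is the regex replacement for the literal text \par
        rtfStr = rtfStr.replaceAll("#NEW_LINE#", "\\\\par ");
    } catch (IOException e) {
        e.printStackTrace();
    } catch (BadLocationException e) {
        e.printStackTrace();
    }
    return rtfStr;
}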
Here, I replace the HTML tags that imply a line break with a special marker string before the conversion, then replace that marker with the RTF paragraph-break control word \par afterwards.
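The marker round trip described above can be tried on its own, without the Swing editor kits. What follows is only a minimal sketch: the class and method names (NewlinePlaceholderDemo, markLineBreaks, restoreLineBreaks) are hypothetical, and the tag patterns are an assumption of mine (case-insensitive and a little stricter than in the answer's code), not something the original author wrote.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class NewlinePlaceholderDemo {
    // Marker text; assumed never to occur in the real document content.
    static final String MARKER = "#NEW_LINE#";

    // Assumed patterns: case-insensitive, and <p ...> must not match <pre>.
    private static final Pattern BR = Pattern.compile("(?i)<br[^>]*>");
    private static final Pattern P_CLOSE = Pattern.compile("(?i)</p>");
    private static final Pattern P_OPEN = Pattern.compile("(?i)<p(\\s[^>]*)?>");

    // Step 1: run before the HTML is handed to the HTML parser.
    static String markLineBreaks(String html) {
        String s = BR.matcher(html).replaceAll(MARKER);
        s = P_CLOSE.matcher(s).replaceAll(MARKER);
        return P_OPEN.matcher(s).replaceAll("");
    }

    // Step 2: run on the generated RTF text.
    // String.replace is literal, so "\\par " is the four characters \par plus a space.
    static String restoreLineBreaks(String rtf) {
        return rtf.replace(MARKER, "\\par ");
    }

    public static void main(String[] args) {
        System.out.println(markLineBreaks("<p>one<br/>two</p>")); // one#NEW_LINE#two#NEW_LINE#
        System.out.println(restoreLineBreaks("one#NEW_LINE#two")); // one\par two
    }
}
```

Note the difference in escaping: String.replace takes the replacement literally, whereas String.replaceAll treats a backslash in the replacement as an escape character, which is why the replaceAll version has to be written as "\\\\par ".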
If you want a more capable API and you have valid HTML, you should explore Apache FOP, which can render XSL-FO to RTF. Here are some useful links:
http://www.torsten-horn.de/techdocs/java-xsl.htm#XSL-FO-Java
http://html2fo.sourceforge.net/index.html
Upvotes: 4