Reputation: 2059
Using Flying Saucer, I can successfully convert HTML to an image with the code below:
// doc: the HTML source parsed as an org.w3c.dom.Document
Java2DRenderer renderer = new Java2DRenderer(doc, width, height);
BufferedImage img = renderer.getImage();
ByteArrayOutputStream os = new ByteArrayOutputStream();
ImageIO.write(img, "jpg", os);
However, this code has problems: it does not render the fonts in the HTML properly.
Also, if the HTML contains Chinese, Japanese, or any other non-ASCII characters, the image is not rendered with the proper content (the characters appear as boxes).
The actual HTML content is:
<div ><ul><li><dl><dt><a href="http://jcs2014.com/ja/about/">イベントについて</a><br></dt><dd><ul><li><a href="http://jcs2014.com/ja/about/support.html">サポーター&フレンズ</a><br></li></ul></dd></dl><dl><dt><a href="http://jcs2014.com/ja/event/">イベント・セミナー一覧</a><br></dt></dl></li></ul><div><br></div></div>
In my case the content can be in any language, but it is always Unicode-encoded. How can I solve this?
Please help.
Upvotes: 2
Views: 2640
Reputation: 2059
String html = "<div ><ul><li><dl><dt><a href=\"http://jcs2014.com/ja/about/\">イベントについて</a><br></dt><dd><ul><li><a href=\"http://jcs2014.com/ja/about/support.html\">サポーター&フレンズ</a><br></li></ul></dd></dl><dl><dt><a href=\"http://jcs2014.com/ja/event/\">イベント・セミナー一覧</a><br></dt></dl></li></ul><div><br></div></div>";
// Read the HTML as UTF-8; if you know the source uses a different encoding, change the charset name accordingly
InputStream htmlStream = new ByteArrayInputStream(html.getBytes("UTF-8"));
Tidy tidy = new Tidy();
org.w3c.dom.Document doc = tidy.parseDOM(new InputStreamReader(htmlStream,"UTF-8"), null);
Java2DRenderer renderer = new Java2DRenderer(doc, width, height);
BufferedImage img = renderer.getImage();
ByteArrayOutputStream os = new ByteArrayOutputStream();
ImageIO.write(img, "jpg", os);
This solved my issue: reading the HTML stream explicitly as UTF-8 makes the non-ASCII characters render correctly.
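To illustrate why the charset matters, here is a minimal standalone sketch using only the JDK (it does not touch Flying Saucer or JTidy). The class and method names are made up for this example, and it assumes the original breakage was caused by the bytes being decoded with a charset other than UTF-8, which the question does not actually confirm.

```java
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Decode the bytes with the same charset that was used to encode them.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as ISO-8859-1 to simulate a charset mismatch
    // (hypothetical stand-in for whatever default charset was in effect).
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "イベント" written with escapes so the source file encoding is irrelevant.
        String original = "\u30a4\u30d9\u30f3\u30c8";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text survives the round trip.
        System.out.println(decodeUtf8(utf8).equals(original));
        // Mismatched charset: each UTF-8 byte becomes its own wrong character.
        System.out.println(decodeLatin1(utf8).equals(original));
        System.out.println(decodeLatin1(utf8).length());
    }
}
```

If the text is already garbled like this before it reaches the renderer, no font can display it correctly, so it seems worth ruling out the decoding step before looking at font configuration.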
Upvotes: 1