Reputation: 41
We're generating HTML files with Apache's Velocity template engine. The generated HTML is rather ugly and not correctly indented.
In my case, the HTML is stored in a String, which I want to reformat so that it looks pretty-printed.
I've already given JTidy a try, but it changes the HTML source when I pipe the raw HTML through it. Sometimes it adds or removes HTML tags.
My question:
Is there a Java library or something else out there which (only!) pretty-prints my HTML code without adding or removing tags in my HTML document? It should only do the indentation, so that it looks pretty-printed! Nothing more, nothing less. Any ideas? :-)
Code suggestions, hints, or tips are also welcome.
Best regards
Upvotes: 4
Views: 3659
Reputation: 765
Maybe a little too late, but I found a solution to this with Jsoup.
You can get the "pretty" version of the HTML by using only the parser and, if needed, avoid the generation of extra HTML elements by passing a "custom parser" (here, the XML parser).
I got the answer from this Jsoup question.
Here it is:
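    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.parser.Parser;
    public static String formatHTML(String html) {
        // Parse with the XML parser instead of the default HTML parser.
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());
        return doc.toString(); // pretty-printed by default
    }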
I hope this helps.
Regards
Upvotes: 2
Reputation: 31182
Find any SAX parser example in Java. Do indent++ for opening tags, indent-- for closing tags, and write the content with the current indentation.
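A minimal sketch of that idea, using the SAX parser that ships with the JDK. The class name `SaxIndenter` and the two-space indent are my own choices for illustration. It assumes the input is well-formed XML (XHTML); typical HTML with unclosed tags will make the parser throw. Note also that SAX hands you decoded text, so entities such as `&amp;` are not re-escaped here, and text may arrive in several chunks.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxIndenter {
    /** Re-indents well-formed markup; throws if the input is not valid XML. */
    static String indent(String xhtml) throws Exception {
        final StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            // Writes one line at the current nesting depth (two spaces per level).
            private void line(String text) {
                for (int i = 0; i < depth; i++) {
                    out.append("  ");
                }
                out.append(text).append('\n');
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                StringBuilder tag = new StringBuilder("<").append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    tag.append(' ').append(atts.getQName(i))
                       .append("=\"").append(atts.getValue(i)).append('"');
                }
                line(tag.append('>').toString());
                depth++; // indent++ for opening tags
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                depth--; // indent-- for closing tags
                line("</" + qName + ">");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Whitespace-only text between tags is dropped; other text is trimmed.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    line(text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xhtml)), handler);
        return out.toString();
    }
}
```

Calling `SaxIndenter.indent("<div><p>Hi</p></div>")` puts every tag and text run on its own line, indented by depth.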
Upvotes: 1
Reputation: 11
Why don't you write a simple Java parser to pretty-print the HTML yourself? Here is a sketch:
I wanted to give you a rough idea here; you can use this as a starting point. I have written many Perl-based pretty printers. You could use Perl to script a parser fairly quickly.
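For illustration, a hand-rolled indenter along these lines might look like the following. This is not the sketch from the answer, only a minimal stand-in; `NaiveHtmlIndenter` and its regexes are hypothetical. It never adds or removes tags, but it assumes there is no bare `<` inside text, and it will happily reflow whitespace-sensitive content such as `<pre>` or `<script>` bodies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveHtmlIndenter {
    // Elements that never take a closing tag, so they must not increase the depth.
    private static final Set<String> VOID = new HashSet<>(Arrays.asList(
        "area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"));

    // A token is a comment, a tag (or doctype), or a run of text.
    private static final Pattern TOKEN =
        Pattern.compile("<!--.*?-->|<[^>]+>|[^<]+", Pattern.DOTALL);

    // Captures the element name of an opening or closing tag.
    private static final Pattern NAME = Pattern.compile("^</?([A-Za-z][A-Za-z0-9]*)");

    static String indent(String html) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String tok = m.group().trim();
            if (tok.isEmpty()) {
                continue; // whitespace between tags
            }
            boolean closing = tok.startsWith("</");
            if (closing && depth > 0) {
                depth--; // a closing tag lines up with its opening tag
            }
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(tok).append('\n');

            // Only real, non-void, non-self-closing opening tags deepen the indent.
            Matcher n = NAME.matcher(tok);
            boolean opening = !closing && n.find()
                && !tok.endsWith("/>")
                && !VOID.contains(n.group(1).toLowerCase());
            if (opening) {
                depth++;
            }
        }
        return out.toString();
    }
}
```

Unlike a real parser, this keeps unbalanced or invalid markup exactly as it came in; it only decides where line breaks and leading spaces go.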
Upvotes: 0