Reputation: 315
I am using the TinyMCE editor in my project. The HTML markup generated by the editor is parsed with Jsoup (v1.7.2) and then used to generate a PDF with Apache FOP. When users stick to the editor's own features, it generates valid HTML markup. But if a user uses the source code tool to paste markup directly from another source and enters, say,
<ul>
<ul>
<ul>
<li>
one
</li>
<li>
two
</li>
<li>
three
</li>
<li>
four
</li>
</ul>
</ul></ul>
the editor does not fix the markup to:
<ul>
<li>
one
</li>
<li>
two
</li>
<li>
three
</li>
<li>
four
</li>
</ul>
According to https://validator.w3.org/nu/#textarea, the first markup is not valid:
Error: Element ul not allowed as child of element ul in this context.
Is it possible to fix the HTML markup in the TinyMCE editor or with the Jsoup parser? If not, is there another approach?
Upvotes: 2
Views: 136
Reputation: 36
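You can try JTidy:
import java.io.FileInputStream;
import java.io.InputStream;
import org.w3c.tidy.Tidy;

Tidy tidy = new Tidy();
tidy.setXHTML(true); // emit well-formed XHTML
try (InputStream inputStream = new FileInputStream("input.html")) {
    tidy.parse(inputStream, System.out); // write the cleaned markup to stdout
}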
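If adding another library is not an option, the specific fix asked for in the question (dropping a ul that sits directly inside another ul) can also be sketched without any dependency. The class below is only an illustration: the name NestedListFlattener and the method flatten are made up for this example, and it assumes every ul and li tag is explicitly closed and properly nested. It is not a general HTML repair tool, and a DOM-based approach would be more robust for arbitrary input.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedListFlattener {
    // Matches opening and closing <ul> and <li> tags (attributes allowed).
    private static final Pattern TAG =
            Pattern.compile("<(/?)(ul|li)\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Removes every ul that is a direct child of another ul, keeping its content. */
    public static String flatten(String html) {
        StringBuilder out = new StringBuilder();
        // One entry per open tag: "li", "ul" (kept) or "ul-dropped" (removed wrapper).
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            out.append(html, last, m.start()); // copy everything between tags unchanged
            last = m.end();
            boolean closing = !m.group(1).isEmpty();
            String name = m.group(2).toLowerCase();
            if (!closing) {
                boolean parentIsUl = !open.isEmpty() && open.peek().startsWith("ul");
                if (name.equals("ul") && parentIsUl) {
                    open.push("ul-dropped"); // redundant wrapper: do not emit it
                } else {
                    open.push(name);
                    out.append(m.group());
                }
            } else {
                // Assumes balanced input: the closing tag belongs to the top of the stack.
                String top = open.isEmpty() ? name : open.pop();
                if (!top.equals("ul-dropped")) {
                    out.append(m.group());
                }
            }
        }
        out.append(html.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "<ul><ul><ul><li>one</li><li>two</li></ul></ul></ul>";
        System.out.println(flatten(input)); // prints <ul><li>one</li><li>two</li></ul>
    }
}
```

A ul nested inside an li is valid HTML, so the sketch leaves it alone. Only a ul whose immediate parent is another ul is removed.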
Upvotes: 2