Reputation: 43
I want TagSoup settings to use HTML5 standards.
I am using the TagSoup parser, which adheres to HTML4 and therefore doesn't allow a <div> inside an <a> tag. As a result, it restructures markup that is valid in HTML5, where this nesting is allowed. How do I make TagSoup (org.ccil.cowan.tagsoup) follow HTML5 rules?
For example,
<a>
<div></div>
</a>
becomes,
<a></a>
<div></div>
Upvotes: 3
Views: 281
Reputation: 21
I had the same problem with the following structure:
<a>
<li></li>
<p></p>
</a>
became,
<a>
<li></li>
</a>
<p></p>
I resolved it by using a custom HTMLSchema:
private class CustomHTMLSchema extends HTMLSchema
{
    public CustomHTMLSchema()
    {
        super();
        // Allow <a> to contain block-level elements, as HTML5 does.
        ElementType elA = getElementType("a");
        elA.setModel(elA.model() | M_BLOCK);
    }
}
...
// Hand the custom schema to the TagSoup parser.
saxParser = SAXParserImpl.newInstance(null);
saxParser.setProperty(Parser.schemaProperty, new CustomHTMLSchema());
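In case it helps to see why OR-ing M_BLOCK into the model changes the result: as far as I understand it, TagSoup describes each element with bit masks, one for what the element may contain and one for the categories it belongs to, and a parent can hold a child only if the two masks overlap. The sketch below is a simplified stand-in written with plain Java, not TagSoup's actual classes. The names ContentModelSketch and Type and the bit values are made up for illustration.

```java
// Simplified stand-in for a bit-mask content model check.
// All names and bit values here are hypothetical, not TagSoup's own.
final class ContentModelSketch {
    // Each bit names a category of elements (arbitrary values for this sketch).
    static final int M_INLINE = 1 << 0;
    static final int M_BLOCK  = 1 << 1;

    /** A tiny element type: what it may contain and what it belongs to. */
    static final class Type {
        final String name;
        int model;          // categories this element may contain
        final int memberOf; // categories this element belongs to

        Type(String name, int model, int memberOf) {
            this.name = name;
            this.model = model;
            this.memberOf = memberOf;
        }

        /** True if the child's categories overlap this element's content model. */
        boolean canContain(Type child) {
            return (model & child.memberOf) != 0;
        }
    }

    public static void main(String[] args) {
        // HTML4-like setup: <a> may contain only inline content.
        Type a = new Type("a", M_INLINE, M_INLINE);
        Type div = new Type("div", M_BLOCK | M_INLINE, M_BLOCK);

        System.out.println(a.canContain(div)); // false

        // Same idea as elA.setModel(elA.model() | M_BLOCK) in the answer above.
        a.model |= M_BLOCK;

        System.out.println(a.canContain(div)); // true
    }
}
```

Once the block bit is part of the model for <a>, the parser has no reason to close the <a> before the block element, which matches the behaviour change described above.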
Upvotes: 2