Priya Singh

Reputation: 21

How to prevent JAXB from parsing HTML-formatted data

I'm using JAXB to parse an XML stream that may contain HTML-formatted data. When I unmarshal this XML and it contains HTML that is not well-formed XML, such as <BR> with no end tag or an unclosed <P>, I get the following error:

javax.xml.bind.UnmarshalException
 - with linked exception:
[org.xml.sax.SAXParseException; lineNumber: 5; columnNumber: 2987; The element type "BR" must be terminated by the matching end-tag </BR>.]

at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
at arserImpl$JAXPSAXParser.parse(Unknown Source)

Is there any way to prevent this HTML-formatted data from being parsed and validated, or to mark part of the XML so that it is treated as a plain String?
Thanks in advance.

Upvotes: 1

Views: 844

Answers (2)

lexicore

Reputation: 43689

You can use something like JTidy to turn your input into valid XML first.

Upvotes: 2

Don Roby

Reputation: 41135

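This is failing because the input is not well-formed XML: every start tag such as <BR> needs a matching end tag, and the parser rejects the document before JAXB ever sees the data. Your best solution would be to make whatever is producing this produce valid XML.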

If you have the ability to preprocess this file, the way to make the parser treat portions of the data as plain text is to put them in a CDATA section.
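As a rough sketch of that preprocessing step, assuming the HTML always sits inside one known element (here a hypothetical <description> tag with no attributes, never nested in itself, and never containing "]]>"), something like the following could wrap the content in CDATA before it reaches the parser. The class and method names are made up for illustration, and the check uses the JDK's plain DOM parser rather than JAXB; JAXB reads the text from the same kind of parser events, so a String-mapped field should receive the CDATA content as-is.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JAXB or any library.
class CdataPreprocess {

    // Wraps the body of every <tag>...</tag> element in a CDATA section.
    // Assumption: the tag has no attributes, is not nested in itself,
    // and its body never contains the sequence "]]>".
    static String wrapInCdata(String xml, String tag) {
        String t = Pattern.quote(tag);
        Pattern p = Pattern.compile("<" + t + ">(.*?)</" + t + ">", Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement =
                "<" + tag + "><![CDATA[" + m.group(1) + "]]></" + tag + ">";
            // quoteReplacement keeps '$' and '\' in the HTML from being
            // interpreted as regex replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Parses the XML with the JDK's DOM parser and returns the text of the
    // first <tag> element; throws if the XML is not well-formed.
    static String readText(String xml, String tag) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<item><description>Line one<BR>Line two<P>para</description></item>";
        String wrapped = wrapInCdata(raw, "description");
        System.out.println(wrapped);
        System.out.println(readText(wrapped, "description"));
    }
}
```

A regex pass like this is fragile and only makes sense when you know exactly which element carries the HTML; fixing the producer is still the cleaner option.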

Upvotes: 0
