Reputation: 6449
I have an XML file that contains a part like the one below. img and br are not meant to be tags, but when parsing, SAX treats them as tags, and because they have no closing tags, SAX raises an error. How do I overcome this? How can I make the parser ignore img and br? Thank you!
<summary xml:base="http://www.dailymail.co.uk/health/index.html?ITO=1490" xml:lang="en-GB" type="html">
<img src="http://i.dailymail.co.uk/i/pix/2011/10/30/article-2055372-01A8032A0000044D-515_87x84.jpg" width="87" height="84"><br>Millions take statins to combat heart disease by lowering cholesterol, but research suggests that high cholesterol could be a key factor in the development of breast cancer.
</summary>
Upvotes: 1
Views: 1531
Reputation: 12009
That is not well-formed XML. In XML, every element must be closed, either with a closing tag (<br>...</br>) or implicitly as an empty-element tag (<br/>). If some markup characters are required as text, they should either be embedded in a CDATA section...
<![CDATA[This is my <em>character</em> data, not markup.]]>
... or be escaped with character entity references:
This is my &lt;em&gt;character&lt;/em&gt; data, not markup.
SAX has no way of knowing that some markup should be treated as XML and other markup should not, just because it happens to consist of HTML elements. If it sees <br>, it will assume that this starts a br element and that a corresponding closing tag will be encountered later.
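To make the behaviour described above concrete, here is a minimal sketch in Python's standard xml.sax module (the question does not say which SAX implementation is in use, so the language is an assumption). The TextCollector class and the parse helper are hypothetical names introduced for illustration. The sketch shows that an unclosed <br> is rejected, while the same text escaped with entity references is delivered to the handler as plain character data.

```python
import xml.sax
from xml.sax.saxutils import escape


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records element names and character data."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.text = []

    def startElement(self, name, attrs):
        self.elements.append(name)

    def characters(self, content):
        # May be called several times per text node, so collect chunks.
        self.text.append(content)


def parse(document):
    """Parse a string and return (element names, concatenated text)."""
    handler = TextCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.elements, "".join(handler.text)


# Unclosed <br>: the parser reaches </summary> while br is still open.
broken = "<summary><br>Millions take statins.</summary>"
try:
    parse(broken)
    broken_ok = True
except xml.sax.SAXParseException:
    broken_ok = False

# Escaped markup: '<' and '>' become &lt; and &gt;, so the parser
# reports "<br>" as text instead of as the start of an element.
escaped = "<summary>" + escape("<br>Millions take statins.") + "</summary>"
elements, text = parse(escaped)
```

In this sketch only summary is reported as an element for the escaped document; the br markup arrives through characters() and can be handled as HTML by whatever consumes the summary.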
Upvotes: 1
Reputation: 11958
Tags must be closed. Try <br/>, and also add a slash ('/') before the end of the img tag, like this:
<img src="path"/>
I've tried it, and it worked ;-)
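If the file cannot be fixed at the source, one possible approach is to repair it before handing it to the parser. The following is a rough Python sketch, assuming the only offenders are img and br and that no attribute value contains a '>' character; VOID_TAGS, self_close_void_tags and NameCollector are hypothetical names, and the sample URL is a shortened placeholder.

```python
import re
import xml.sax

# Assumed list of HTML void elements to repair; extend as needed.
VOID_TAGS = ("img", "br")

# Matches <img ...> or <br>, whether or not it is already self-closed.
_VOID_RE = re.compile(
    r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(VOID_TAGS), re.IGNORECASE
)


def self_close_void_tags(text):
    """Rewrite <br> as <br/> and <img ...> as <img .../>."""
    return _VOID_RE.sub(lambda m: "<%s%s/>" % (m.group(1), m.group(2)), text)


class NameCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records the names of elements seen."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


sample = (
    '<summary type="html">'
    '<img src="http://example.invalid/a/b.jpg" width="87" height="84"><br>'
    "Millions take statins.</summary>"
)
fixed = self_close_void_tags(sample)

handler = NameCollector()
xml.sax.parseString(fixed.encode("utf-8"), handler)
```

A regular expression is a blunt tool for this job: it works for simple cases like the one in the question, but a feed with arbitrary HTML in the summary would be better served by escaping the whole summary body.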
Upvotes: 1
Reputation: 333
I think this XML is not well-formed: every parser will try to parse the img and br tags in that XML.
They should be wrapped in a CDATA section so that they are not parsed:
http://www.w3schools.com/xml/xml_cdata.asp
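As a sketch of the CDATA suggestion, the Python snippet below wraps the body of each summary element in a CDATA section before parsing. It assumes summary elements are never nested and that the body contains no entity references that should still be decoded (inside CDATA, &amp; stays as literal text). wrap_summary_in_cdata and TextCollector are hypothetical names.

```python
import re
import xml.sax

# Assumes non-nested <summary ...>...</summary> elements.
_SUMMARY_RE = re.compile(r"(<summary\b[^>]*>)(.*?)(</summary>)", re.DOTALL)


def wrap_summary_in_cdata(text):
    """Wrap each summary body in <![CDATA[ ... ]]>."""

    def repl(match):
        # A literal "]]>" would end the section early, so split it
        # across two adjacent CDATA sections.
        body = match.group(2).replace("]]>", "]]]]><![CDATA[>")
        return match.group(1) + "<![CDATA[" + body + "]]>" + match.group(3)

    return _SUMMARY_RE.sub(repl, text)


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records element names and character data."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.text = []

    def startElement(self, name, attrs):
        self.elements.append(name)

    def characters(self, content):
        self.text.append(content)


sample = (
    '<summary type="html">'
    '<img src="x.jpg"><br>Millions take statins.'
    "</summary>"
)
wrapped = wrap_summary_in_cdata(sample)

handler = TextCollector()
xml.sax.parseString(wrapped.encode("utf-8"), handler)
summary_text = "".join(handler.text)
```

With the body wrapped this way, the img and br markup reaches the handler as character data and no element events are raised for it.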
Upvotes: 1