Reputation: 5285
I have written an XML parser that successfully parses the XML file it is given as input. But sometimes the input file given to my parser has a double quote inside an attribute value, which makes my parser crash.
E.g.:
<tag myprop=" this has a extra quote here like " some times" > </tag>
I know which tag may or may not have the extra quote. I use a DOM parser.
How can I handle this situation?
Upvotes: 1
Views: 2243
Reputation: 3247
See the XML 1.0 specification, section 2.4:
http://www.w3.org/TR/xml/#attdecls
"To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;"."
So, since it's not valid XML, your parser shouldn't try to handle the invalid value; it just needs to report an error.
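As a rough illustration of the quoted rule, here is a minimal Python sketch of the producing side: it escapes both quote characters before the value is written into an attribute, then checks that a DOM parser gives the original text back. Python's standard library is my choice here, not something the question specifies, and the tag and attribute names are simply taken from the question's example.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

# The raw attribute value from the question, including the stray double quote.
raw_value = ' this has a extra quote here like " some times'

# escape() handles &, < and > by default; the extra mapping adds both quote characters.
escaped = escape(raw_value, {'"': "&quot;", "'": "&apos;"})

# With the quote escaped, the attribute is well-formed and parses normally.
xml_text = '<tag myprop="' + escaped + '"> </tag>'
parsed_value = minidom.parseString(xml_text).documentElement.getAttribute("myprop")
```

The parser turns `&quot;` back into a plain double quote, so `parsed_value` equals `raw_value`.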
Upvotes: 0
Reputation: 9498
You won't be able to use an XML parser until you have actual XML. What you currently have is invalid (i.e. not XML). You should escape the quote mark inside the attribute beforehand.
The escaped code would look like:
<tag myprop=" this has a extra quote here like &quot; some times" > </tag>
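If you cannot get the producer of the file to fix it, escaping "beforehand" could be done as a text pre-processing step before the DOM parser ever sees the input. The following is only a heuristic sketch in Python (my choice of language; the question does not name one). It assumes that `myprop` is the only attribute on `tag`, that the attribute value never contains `>`, and that the helper name `escape_stray_quotes` is hypothetical. It will not work for arbitrary broken XML.

```python
import re
from xml.dom import minidom

# Assumption: the known tag looks like <tag myprop="..."> with no other attributes.
# The lazy group 2 stops at the first quote that is directly followed by the end
# of the start tag, so any earlier quotes are treated as part of the value.
_KNOWN_TAG = re.compile(r'(<tag\s+myprop=")(.*?)("\s*/?>)', re.DOTALL)

def escape_stray_quotes(text):
    """Replace raw double quotes inside the known attribute value with &quot;."""
    def fix(match):
        value = match.group(2).replace('"', "&quot;")
        return match.group(1) + value + match.group(3)
    return _KNOWN_TAG.sub(fix, text)

broken = '<tag myprop=" this has a extra quote here like " some times" > </tag>'
repaired = escape_stray_quotes(broken)

# The repaired text is well-formed, so a DOM parser accepts it.
value = minidom.parseString(repaired).documentElement.getAttribute("myprop")
```

Input that is already well-formed passes through unchanged, because a correct attribute value contains no raw double quote to replace.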
As to why your parser crashes: there are dozens of XML libraries in existence. Have you looked at any of those? Personally, I would expect to receive a ParseException or something similar.
Upvotes: 1
Reputation: 29493
You cannot. That is not valid XML, so the DOM parser will fail to parse it.
Upvotes: 0
Reputation: 1427
I don't know for sure, but I think it's just invalid XML, so your parser should fail gracefully (rather than crash). I don't think it should successfully parse such a file.
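To show what "fail gracefully" might look like with an existing library, here is a small Python sketch using the standard `xml.dom.minidom` module, which raises `xml.parsers.expat.ExpatError` on malformed input. The wrapper name `try_parse` is hypothetical, and Python itself is an assumption, since the question does not say which language or parser is in use.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

def try_parse(text):
    """Return (document, None) on success or (None, error message) on failure."""
    try:
        return minidom.parseString(text), None
    except ExpatError as exc:
        # ExpatError carries the position at which the parser gave up.
        return None, "line %d, column %d: %s" % (exc.lineno, exc.offset, exc)

broken = '<tag myprop=" this has a extra quote here like " some times" > </tag>'
doc, error = try_parse(broken)
```

Here `doc` is `None` and `error` holds a readable message, so the caller can report the bad file instead of crashing.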
Upvotes: 1