tzippy

Reputation: 6638

Trying to parse non-well-formed XML using NSXMLParser

I am parsing XML data with NSXMLParser, and I have noticed that element content can contain any character, including, for example, an unescaped &. Because the parser reports an error when it encounters this character, I replaced every occurrence of it. Now I want to make sure I handle every character that may cause errors. Which characters are they, and what is the best way to handle them? Thanks in advance!

Upvotes: 0

Views: 584

Answers (2)

Simon Lee

Reputation: 22334

You should encode these characters; for instance, & becomes &amp; and " becomes &quot;.

Once the characters are encoded, the text should pass through the parser without errors. Your other option is to use a different XML parser, such as TBXML, which doesn't do format checking.

Upvotes: 0

peterjb

Reputation: 3189

To answer half your question, XML has 5 special characters that you may want to escape:

< -- replace with &lt;

> -- replace with &gt;

& -- replace with &amp;

' -- replace with &apos;

and

" -- replace with &quot;
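As a rough illustration of the list above, here is a minimal sketch in Python (chosen only to keep the example short; the function name `escape_xml_text` is made up for this example). The one detail that is easy to get wrong is the order: & has to be replaced first, or the ampersands produced by the other replacements get escaped a second time. In Objective-C the same order could presumably be applied with repeated calls to `stringByReplacingOccurrencesOfString:withString:`.

```python
# Order matters: '&' must be replaced first, otherwise the ampersands
# introduced by the other replacements would be escaped a second time.
XML_ESCAPES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("'", "&apos;"),
    ('"', "&quot;"),
]


def escape_xml_text(raw):
    """Escape the five XML special characters in a plain-text value.

    Hypothetical helper: it expects raw text only, not a string that
    already contains tags or entities.
    """
    escaped = raw
    for char, entity in XML_ESCAPES:
        escaped = escaped.replace(char, entity)
    return escaped


print(escape_xml_text('Tom & "Jerry" <3'))
# prints: Tom &amp; &quot;Jerry&quot; &lt;3
```

This only works on values before they are placed inside tags; running it over a whole document would also escape the markup.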

Now, for the other half: how to find and replace these without also replacing the characters that make up the tags themselves. It's not easy, but I'd look into regular expressions and NSRegularExpression: http://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html
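For the ampersand case from the question, one possible regular expression is sketched below, again in Python for brevity. It escapes an & only when it does not already start one of the five predefined entities or a numeric character reference. The names `BARE_AMPERSAND` and `escape_bare_ampersands` are invented for this example, and the sketch assumes the input has no CDATA sections and no custom entities. A stray < or > in text cannot be repaired this way, because a pattern alone cannot tell it apart from real markup.

```python
import re

# Matches '&' unless it already starts one of the five predefined entities
# or a numeric character reference such as &#38; or &#x26;.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:amp|lt|gt|apos|quot|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(xml_text):
    """Replace unescaped ampersands with &amp; and leave entities alone."""
    return BARE_AMPERSAND.sub("&amp;", xml_text)


print(escape_bare_ampersands("<name>Ben & Jerry &amp; Co</name>"))
# prints: <name>Ben &amp; Jerry &amp; Co</name>
```

NSRegularExpression uses ICU syntax, which also supports negative lookahead, so the same pattern should carry over, though that is worth testing against your own data.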

Remember, depending on your use case, to escape the attribute values on tags, too: <tag parameter="with &quot;quotes&quot;" />
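If you generate the XML yourself, a small helper for attribute values might look like the sketch below. It uses `escape` from Python's standard `xml.sax.saxutils` module, which handles &, < and > and accepts an extra mapping for further characters; `render_attribute` is a hypothetical name used only here.

```python
from xml.sax.saxutils import escape


def render_attribute(name, value):
    """Return name="value" with the value escaped for a double-quoted attribute."""
    # escape() handles &, < and >; the extra mapping also escapes the
    # double quote, since the value is wrapped in double quotes below.
    return '{}="{}"'.format(name, escape(value, {'"': "&quot;"}))


print("<tag " + render_attribute("parameter", 'with "quotes"') + " />")
# prints: <tag parameter="with &quot;quotes&quot;" />
```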

Upvotes: 2
