Mark Maslar

Reputation: 1121

How to prevent XElement from decoding character entity references

I have an XML string that contains an apostrophe. I replace the apostrophe with its encoded equivalent (&#39;) and parse the revised string into an XElement. The XElement, however, turns the &#39; back into an apostrophe.

How do I force XElement.Parse to preserve the encoded string?

string originalXML = @"<Description><data>Mark's Data</data></Description>"; //for illustration purposes only
string encodedApostrophe = originalXML.Replace("'", "&#39;");
XElement xe = XElement.Parse(encodedApostrophe);

Upvotes: 0

Views: 2809

Answers (1)

svick

Reputation: 244777

This is correct behavior. In places where ' is allowed, it means the same as &apos;, &#39; or &#x27;, so the parser decodes all of them to the same character. If you want to include the literal string &#39; in the XML, you should encode the &:

originalXML.Replace("'", "&amp;#39;")

Or parse the original XML and modify that:

XElement xe = XElement.Parse(originalXML);
var data = xe.Element("data");
data.Value = data.Value.Replace("'", "&#39;");

But doing this seems really weird. Maybe there is a better solution to the problem you're trying to solve.
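To make the difference between the approaches above concrete, here is a minimal sketch that compares what the parser stores with what gets serialized. The class EntityDemo and its helper methods are hypothetical names introduced only for this illustration; the only library calls used are XElement.Parse, Element, Value and ToString(SaveOptions.DisableFormatting).

```csharp
using System;
using System.Xml.Linq;

public static class EntityDemo
{
    // Hypothetical helper: parse the XML and return the decoded text of <data>.
    public static string DataValue(string xml) =>
        XElement.Parse(xml).Element("data").Value;

    // Hypothetical helper: parse, then serialize again without added whitespace.
    public static string RoundTrip(string xml) =>
        XElement.Parse(xml).ToString(SaveOptions.DisableFormatting);

    // Second approach from the answer: parse first, then edit the text value.
    // The writer escapes the & on output, so the file contains &amp;#39;.
    public static string EditAfterParse(string xml)
    {
        XElement xe = XElement.Parse(xml);
        XElement data = xe.Element("data");
        data.Value = data.Value.Replace("'", "&#39;");
        return xe.ToString(SaveOptions.DisableFormatting);
    }
}
```

With the question's sample string, DataValue on the &#39; version returns the plain apostrophe, while both the &amp;#39; replacement and EditAfterParse serialize as Mark&amp;#39;s Data. A consumer that parses that output once will see the literal text &#39;, not an apostrophe.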

Also, this encoding is not an “ASCII equivalent”; these are called character entity references, and the numeric ones are based on the Unicode code point of the character.
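As a small illustration of the code point relationship, the sketch below builds the decimal and hexadecimal forms from a character's numeric value. CodePointDemo and its methods are hypothetical names, and the sketch assumes a character that fits in a single C# char (it does not handle surrogate pairs).

```csharp
using System;

public static class CodePointDemo
{
    // Decimal numeric reference, e.g. the apostrophe (code point 39).
    public static string DecimalReference(char c) =>
        "&#" + (int)c + ";";

    // Hexadecimal numeric reference for the same code point (39 == 0x27).
    public static string HexReference(char c) =>
        "&#x" + ((int)c).ToString("X") + ";";
}
```

For the apostrophe these produce &#39; and &#x27;, the same two numeric forms mentioned at the start of this answer.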

Upvotes: 1
