user31673

Reputation: 13695

How to decode string to XML string in C#

I have a string (from a CDATA element) that contains an escaped XML document. I need to decode it in C# into a new string in which the markup characters appear as the actual characters.

Existing String:

&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;yes&quot;?&gt;&lt;myreport xmlns=&quot;http://test.com/rules/client&quot;&gt;&lt;admin&gt;&lt;ordernumber&gt;123&lt;/ordernumber&gt;&lt;state&gt;NY&lt;/state&gt;&lt;/report&gt;&lt;/myreport&gt;

String Wanted:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<myreport xmlns="http://test.com/rules/client">
<admin><ordernumber>123</ordernumber><state>NY</state></report></myreport>

Upvotes: 48

Views: 80112

Answers (7)

Kirill Polishchuk

Reputation: 56162

  1. HttpUtility.HtmlDecode from System.Web
  2. WebUtility.HtmlDecode from System.Net
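
A minimal sketch of the second option, assuming the escaped XML is already held in a string. The class and method names are illustrative only and do not come from the question:

```csharp
using System.Net;

public static class DecodeExample
{
    // Illustrative wrapper: WebUtility.HtmlDecode turns &lt;, &gt;, &quot;,
    // &amp; and other HTML entities back into the characters they stand for.
    public static string Decode(string escaped)
    {
        return WebUtility.HtmlDecode(escaped);
    }
}
```

Calling DecodeExample.Decode("&lt;state&gt;NY&lt;/state&gt;") should return <state>NY</state>. HttpUtility.HtmlDecode can be called the same way if System.Web is already referenced.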

Upvotes: 52

Sharthak Ghosh

Reputation: 596

HttpUtility.HtmlDecode(xmlString) will solve this issue.

Upvotes: 1

Ghasem

Reputation: 15573

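You just need to replace the escaped characters with their originals. Replace &amp; last, so that input such as &amp;gt; is not unescaped twice.

// &amp; is handled last to avoid double-decoding sequences like "&amp;lt;".
string stringWanted = existingString.Replace("&lt;", "<")
                                    .Replace("&gt;", ">")
                                    .Replace("&quot;", "\"")
                                    .Replace("&apos;", "'")
                                    .Replace("&amp;", "&");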

Upvotes: -1

Andrei S

Reputation: 9

In a Razor view you can use Html.Raw. That way the markup is not encoded when it is written to the page.

Upvotes: 0

Noah Stahl

Reputation: 7553

You might also consider the static XDocument.Parse method. I'm not sure how it compares to the others mentioned here, but it seems to parse these strings well.

Once you get the resulting XDocument, you could turn around with ToString to get the string back:

string parsedString = XDocument.Parse("<My XML />").ToString();
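
XDocument.Parse expects real markup, so an escaped string has to be unescaped before it is parsed. A hedged sketch that combines it with WebUtility.HtmlDecode follows; the helper name is made up for illustration, and it assumes the decoded text is well-formed XML:

```csharp
using System.Net;
using System.Xml.Linq;

public static class ParseExample
{
    // Hypothetical helper: unescape first, then let XDocument re-serialize
    // the document with indentation.
    public static string DecodeAndFormat(string escaped)
    {
        string xml = WebUtility.HtmlDecode(escaped);

        // Throws System.Xml.XmlException if the decoded text is not well-formed.
        XDocument doc = XDocument.Parse(xml);

        // ToString() omits the XML declaration; doc.Declaration holds it if present.
        return doc.ToString();
    }
}
```

The sample in the question closes admin with </report>, so it would fail at the Parse step as posted. In that case decoding alone is the safer route.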

Upvotes: -2

matabares

Reputation: 838

You can use System.Net.WebUtility.HtmlDecode instead of HttpUtility.HtmlDecode.

This is useful if you don't want a System.Web reference and prefer System.Net instead.

Upvotes: 46

Wernight

Reputation: 37600

As Kirill and msarchet said, you can use HttpUtility.HtmlDecode from System.Web. It unescapes pretty much anything correctly.

If you don't want to reference System.Web, you can use the following trick, which supports all XML escaping but not HTML-specific entities like &eacute;:

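// Requires: using System.Xml;
public static string XmlDecode(string value) {
    var xmlDoc = new XmlDocument();
    // Wrap the escaped text in a dummy element so the XML parser unescapes it.
    xmlDoc.LoadXml("<root>" + value + "</root>");
    return xmlDoc.InnerText;
}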

You could also use a Regex or a simple string.Replace, but that would only cover the basic XML escapes. Numeric character references like &#x410; or named HTML entities like &eacute; would be harder to support.
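
To give an idea of the extra work involved, here is a rough sketch of handling numeric character references with a Regex. The class and method names are hypothetical, and named entities such as &lt; or &eacute; are deliberately left out, so they would still need a separate pass:

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumericRefExample
{
    // Matches hexadecimal (&#x410;) and decimal (&#1040;) character references.
    private static readonly Regex NumericRef =
        new Regex(@"&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    public static string DecodeNumericRefs(string value)
    {
        return NumericRef.Replace(value, m =>
        {
            int codePoint;
            bool ok = m.Groups[1].Success
                ? int.TryParse(m.Groups[1].Value, NumberStyles.AllowHexSpecifier,
                               CultureInfo.InvariantCulture, out codePoint)
                : int.TryParse(m.Groups[2].Value, NumberStyles.None,
                               CultureInfo.InvariantCulture, out codePoint);

            // Leave the reference untouched if it is not a valid Unicode scalar value.
            if (!ok || codePoint < 0 || codePoint > 0x10FFFF ||
                (codePoint >= 0xD800 && codePoint <= 0xDFFF))
            {
                return m.Value;
            }
            return char.ConvertFromUtf32(codePoint);
        });
    }
}
```

Even this covers only one class of escapes, which is why the built-in HtmlDecode methods are usually the simpler choice.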

Upvotes: 6
