csharpnewbie

Reputation: 829

Parse CData from XML in C#

I am trying to parse XML in which one node's value is a CDATA section. My XML structure is as follows.

<node1>
<node2>
<![CDATA[ <!--@@@BREAK TYPE="TABLE" @@@--> <P><CENTER>... html goes here.. ]]>
</node2>
</node1>

My code is below. When I parse the XML, I get the value including the CDATA wrapper rather than just the content inside it. Can you please help me fix this?

XDocument xmlDoc = XDocument.Parse(responseString);
XElement node1Element = xmlDoc.Descendants("node1").FirstOrDefault();
string cdataValue = node1Element.Element("node2").Value;

Actual Output: <![CDATA[ <!--@@@BREAK TYPE="TABLE" @@@--> <P><CENTER>... html goes here.. ]]>

Expected Output:  <!--@@@BREAK TYPE="TABLE" @@@--> <P><CENTER>... html goes here..

I was not sure whether System.Xml.Linq.XDocument was causing the problem, so I tried the XmlDocument version below.

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(responseString);
XmlNode node = xmlDoc.DocumentElement.SelectSingleNode(@"/node1/node2");
XmlNode childNode = node.ChildNodes[0];
if (childNode is XmlCDataSection)
{}

The if condition evaluates to false. So it looks like something is wrong with my XML and it is not actually a valid CDATA section? Please help me fix the problem, and let me know if you need more details.

Upvotes: 3

Views: 14840

Answers (4)

Esteban Saavedra

Reputation: 1

I solved this case as follows:

XDocument xdoc = XDocument.Parse(vm.Xml);

XNamespace cbc = @"urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";
var list2 =
    (from el in xdoc.Descendants(cbc + "Description")
     select el).FirstOrDefault();

var queryCDATAXML =
    (from eel in list2.DescendantNodes()
     select eel.Parent.Value.Trim()).FirstOrDefault();

Upvotes: 0

MaxKlaxx

Reputation: 763

I tried your code and the CDATA value comes out correctly.

How do you fill your responseString? :-)

static void Main(string[] args)
{
  string responseString = "<node1>" +
                          "<node2>" +
                          "<![CDATA[ <!--@@@BREAK TYPE=\"TABLE\" @@@--> <P><CENTER>... html goes here.. ]]>" +
                          "</node2>" +
                          "</node1>";

  XDocument xmlDoc = XDocument.Parse(responseString);
  XElement node1Element = xmlDoc.Descendants("node1").FirstOrDefault();
  string cdataValue = node1Element.Element("node2").Value;

  Console.WriteLine(cdataValue);
  // output:  <!--@@@BREAK TYPE="TABLE" @@@--> <P><CENTER>... html goes here.. 
}

Upvotes: 0

csharpnewbie

Reputation: 829

It turned out the HTML was being escaped when I read the response with StreamReader, so "<" was changed to "&lt;" and the content was no longer recognized as a CDATA section. I had to unescape it first:

XDocument xmlDoc = XDocument.Parse(HttpUtility.HtmlDecode(responseString));

That fixed it.
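To illustrate the difference, here is a minimal sketch, assuming the response looked roughly like the question's XML with the contents of node2 escaped. The class name, the helper ReadNode2, and the sample strings are hypothetical, and it uses System.Net.WebUtility.HtmlDecode in place of HttpUtility.HtmlDecode so that it does not need a reference to System.Web.

```csharp
using System;
using System.Net;
using System.Xml.Linq;

public static class EscapedCDataDemo
{
    // Hypothetical helper: parses the XML and returns the text value of node2.
    public static string ReadNode2(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Element("node2").Value;
    }

    public static void Main()
    {
        // Assumed sample payload: the CDATA markup arrives as escaped text.
        string escaped =
            "<node1><node2>&lt;![CDATA[ &lt;P&gt;html goes here&lt;/P&gt; ]]&gt;</node2></node1>";

        // Parsed as-is, node2 holds plain text that merely looks like a CDATA section.
        Console.WriteLine(ReadNode2(escaped));

        // After decoding, the parser sees a real CDATA section and Value returns only its content.
        string decoded = WebUtility.HtmlDecode(escaped);
        Console.WriteLine(ReadNode2(decoded));
    }
}
```

Note that decoding the whole document this way could produce invalid XML if text outside the CDATA section contains legitimately escaped characters, so it is probably better to fix the escaping at the source where that is possible.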

Upvotes: 0

Jeff Mercado

Reputation: 134601

What you're describing will never actually happen. Getting the Value of a node that contains cdata as a child will give you the contents of the cdata, the inner text. You should already be getting your expected output.

The only way you can get the actual cdata node is if you actually get the cdata node.

var cdata = node1Element.Element("node2").FirstNode;
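As a rough sketch of the distinction, the following compares the element's Value with the CDATA node itself. The class name, the helper FirstCData, and the sample XML are hypothetical. The helper filters by type rather than relying on FirstNode, in case other nodes precede the CDATA section.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CDataNodeDemo
{
    // Hypothetical helper: returns the first CDATA child of the element, or null if there is none.
    public static XCData FirstCData(XElement element)
    {
        return element.Nodes().OfType<XCData>().FirstOrDefault();
    }

    public static void Main()
    {
        // Assumed sample input, shortened from the question's XML.
        XDocument doc = XDocument.Parse(
            "<node1><node2><![CDATA[ <P>html goes here</P> ]]></node2></node1>");
        XElement node2 = doc.Root.Element("node2");
        XCData cdata = FirstCData(node2);

        // The element's Value is the inner text, without the CDATA wrapper.
        Console.WriteLine(node2.Value);

        // The CDATA node's Value is the same inner text.
        Console.WriteLine(cdata.Value);

        // Only the node's XML representation includes the wrapper.
        Console.WriteLine(cdata.ToString());
    }
}
```

If FirstCData returns null for a real response, the content was not parsed as a CDATA section at all, which points to a problem with how the input string was produced.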

Upvotes: 4
