Reputation: 4410
I have some code that returns the InnerXml of an XmlNode.
The node can contain either plain text (with HTML markup in it) or XML.
For example:
<XMLNode>
    Here is some &lt;strong&gt;HTML&lt;/strong&gt;
</XMLNode>
or
<XMLNode>
    <XMLContent>Here is some content</XMLContent>
</XMLNode>
If I get the InnerXml for <XMLNode>, the HTML tags are returned as XML entities.
I cannot use InnerText because I need to be able to get the XML contents. So all I really need is a way to un-escape the HTML tags, because I can detect if it's XML or not and act accordingly.
I guess I could use HTMLDecode, but will this decode all the XML encoded entities?
Update: I guess I'm rambling a bit above so here is a clarified scenario:
I have an XML document that looks like this:
<content id="1">
    <data>&lt;p&gt;A Test&lt;/p&gt;</data>
</content>
<content id="2">
    <data>
        <dataitem>A test</dataitem>
    </data>
</content>
If I do:
XmlNode xn1 = document.SelectSingleNode("/content[@id=1]/data");
XmlNode xn2 = document.SelectSingleNode("/content[@id=2]/data");
Console.WriteLine(xn1.InnerXml);
Console.WriteLine(xn2.InnerXml);
xn1 will return &lt;p&gt;A Test&lt;/p&gt;
xn2 will return <dataitem>A test</dataitem>
I am already checking to see if what is returned is XML (in the case of xn2), so all I need to do is un-escape the &lt; etc. in xn1.
HtmlDecode does this, but I'm not sure it would work for everything. So the question remains: would HtmlDecode handle all the possible entities, or is there a class somewhere that will do it for me?
Upvotes: 2
Views: 2490
Reputation: 96770
I think Tomalak is on the right track, but I'd write the code a little differently:
XmlNode xn = document.SelectSingleNode("/content[@id=1]/data");
if (xn.ChildNodes.Count != 1)
{
    throw new InvalidOperationException("I don't know what to do if there's not exactly one child node.");
}
XmlNode child = xn.ChildNodes[0];
switch (child.NodeType)
{
    case XmlNodeType.Element:
        // The content is real XML: return the markup as-is.
        Console.WriteLine(xn.InnerXml);
        break;
    case XmlNodeType.Text:
        // Read the text node's value; Value on the element itself is always null.
        // The text node's value is already un-escaped.
        Console.WriteLine(child.Value);
        break;
    default:
        throw new InvalidOperationException("I can only handle elements and text nodes.");
}
This code makes a lot of your implicit assumptions explicit, and when you encounter data that's not in the form you expect, it will tell you why it failed.
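For completeness, here is a rough, self-contained sketch of the same idea run against both nodes. The wrapping <root> element and the ReadData helper are my own additions for illustration (the sample in the question has two top-level elements, which would not load as-is), so adjust the XPath to your real document.

```csharp
using System;
using System.Xml;

class Program
{
    // Hypothetical helper: returns markup for an element child,
    // or the un-escaped text for a text child.
    static string ReadData(XmlNode data)
    {
        if (data.ChildNodes.Count != 1)
            throw new InvalidOperationException("Expected exactly one child node.");

        XmlNode child = data.ChildNodes[0];
        switch (child.NodeType)
        {
            case XmlNodeType.Element:
                return data.InnerXml;
            case XmlNodeType.Text:
                // Value is null on the element, so read it from the text node.
                return child.Value;
            default:
                throw new InvalidOperationException("Only element and text children are handled.");
        }
    }

    static void Main()
    {
        var document = new XmlDocument();
        // Assumed sample data, wrapped in a single root so that it is well-formed.
        document.LoadXml(
            "<root>" +
            "<content id=\"1\"><data>&lt;p&gt;A Test&lt;/p&gt;</data></content>" +
            "<content id=\"2\"><data><dataitem>A test</dataitem></data></content>" +
            "</root>");

        // Prints: <p>A Test</p>
        Console.WriteLine(ReadData(document.SelectSingleNode("/root/content[@id=1]/data")));
        // Prints: <dataitem>A test</dataitem>
        Console.WriteLine(ReadData(document.SelectSingleNode("/root/content[@id=2]/data")));
    }
}
```

No HtmlDecode call is needed in the text case, because the parser has already resolved the entities by the time you read the text node.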
Upvotes: 1
Reputation: 338278
Your question is a bit hard to follow. Here are the things that I did not fully understand:
EDIT
I think I get the picture, but correct me if I'm still wrong. You want to pluck "<p>A Test</p>" out of xn1, but "A test" out of xn2.
So InnerXml is the way to go for xn1, and InnerText would be right for xn2.
Well, do it that way then: test for the existence of dataitem and decide what to do when you know.
XmlNode xn = document.SelectSingleNode("/content[@id=1]/data");
if (xn.SelectSingleNode("dataitem") == null)
    Console.WriteLine(xn.InnerXml);
else
    Console.WriteLine(xn.InnerText);
To answer your question regarding HttpUtility.HtmlDecode, I just looked at the implementation and it looks like it would "work for everything", but it seems superfluous to me if the string you are looking for is coming out of InnerXml.
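If you want to see what each option actually gives you, a small comparison along these lines may help. It is only a sketch: the sample string is made up, and I am using System.Net.WebUtility.HtmlDecode as a stand-in for HttpUtility.HtmlDecode so that the snippet does not need a System.Web reference.

```csharp
using System;
using System.Net;
using System.Xml;

class Program
{
    static void Main()
    {
        var document = new XmlDocument();
        // Assumed sample: HTML markup stored as escaped text inside <data>.
        document.LoadXml("<data>&lt;p&gt;A &amp; B&lt;/p&gt;</data>");
        XmlNode data = document.DocumentElement;

        // InnerXml re-serializes the content, so the text comes back escaped.
        Console.WriteLine(data.InnerXml);

        // InnerText returns the parsed text; prints: <p>A & B</p>
        Console.WriteLine(data.InnerText);

        // Decoding InnerXml by hand; prints: <p>A & B</p>
        Console.WriteLine(WebUtility.HtmlDecode(data.InnerXml));
    }
}
```

The last two lines come out the same for this input, which is why the extra decoding step looks redundant whenever the XML API can hand you the text directly.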
Upvotes: 2
Reputation: 9881
Why not insert them as &lt; and &gt;? That way you avoid mixing XML with custom markup.
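A minimal sketch of what I mean, assuming you build the document with XmlDocument (the <data> element and the sample string are just for illustration): assigning through InnerText lets the API do the escaping on the way in and the un-escaping on the way out.

```csharp
using System;
using System.Xml;

class Program
{
    static void Main()
    {
        var document = new XmlDocument();
        document.LoadXml("<data/>");

        // Assigning InnerText stores the markup as plain text;
        // the angle brackets are escaped when the document is serialized.
        document.DocumentElement.InnerText = "<p>A Test</p>";

        // Serialized form, with the markup escaped inside <data>.
        Console.WriteLine(document.OuterXml);

        // Reading it back returns the original string; prints: <p>A Test</p>
        Console.WriteLine(document.DocumentElement.InnerText);
    }
}
```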
Upvotes: 2