Reputation: 2761
While recursively iterating over each node in an XML document, I'm having a difficult time determining whether the current node has a value or contains embedded XML.
It seems that XmlNode.NodeType is set to Element in both cases, and when the element holds a value (and not more XML), ChildNodes.Count is not zero (it's actually 1).
A simple XML file I'm using for testing is:
<note>
<to>You</to>
<from>Me</from>
<subject>Hello!</subject>
<body>Check out this cool data!</body>
<data>
<name>Something cool</name>
<location>Mars</location>
<distance>54 million kilometers</distance>
</data>
</note>
Each of the XmlNodes above has NodeType 'Element' and ChildNodes.Count >= 1.
What can I use to reliably test if an XmlNode should be treated as a container (like note and data) or as holding a value (like to, from, subject, body, name, location, distance)?
Upvotes: 1
Views: 169
Reputation: 70
Check out the answers to this post to see if they get you going in the right direction:
How to get "real" ChildNodes of XmlNode, ignoring whitespace nodes?
Upvotes: 0
Reputation: 35464
From your example, you could check whether the first child node is of type Element.
bool isContainer(XmlNode node) {
return node.ChildNodes.Count > 0 && node.ChildNodes[0].NodeType == XmlNodeType.Element;
}
Note that this will not handle mixed content data.
Upvotes: 0
Reputation: 2099
I don't know if you can use System.Xml.Linq.XElement instead of XmlDocument here, but if you can, you can go about this the following way:
var xml = XElement.Parse("<note> .... </note>");
then xml.Elements().Count() returns 5, the correct number of subnodes, whereas xml.Elements().First().Elements().Count() returns 0 because the to node has zero child elements.
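As a rough sketch of the same idea, XElement also exposes a HasElements property, so the count comparison can be avoided. The class name XElementKind and the helper LeafNames below are hypothetical names chosen for illustration, not part of any library:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class; only XElement.HasElements and Descendants() are real API.
public static class XElementKind
{
    // A "container" is an element with at least one child element.
    public static bool IsContainer(XElement element) => element.HasElements;

    // Names of all descendant elements that have no child elements,
    // in document order (these are the value-holding nodes).
    public static IEnumerable<string> LeafNames(XElement root) =>
        root.Descendants()
            .Where(e => !e.HasElements)
            .Select(e => e.Name.LocalName);
}
```

With the sample file from the question, IsContainer should be true for note and data and false for the rest. Note that an empty element such as <to/> also counts as a non-container here.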
Upvotes: 1
Reputation: 100630
Usually you know which nodes contain values because you know the structure of the XML.
If you need to infer that information from XML of arbitrary structure: text is represented by Text and CDATA nodes, so you can check whether an element has only children of those types to find "text only" nodes. See How to get text inside an XmlNode.
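A minimal sketch of that check for XmlNode, assuming whitespace nodes should also count as text. The class name XmlNodeKind and the method IsTextOnly are hypothetical names used only for this example:

```csharp
using System.Linq;
using System.Xml;

// Hypothetical helper; the XmlNode members used here are standard System.Xml API.
public static class XmlNodeKind
{
    // True when the node is an element that has children and every child
    // is a Text, CDATA, or whitespace node (no child elements at all).
    public static bool IsTextOnly(XmlNode node) =>
        node.NodeType == XmlNodeType.Element
        && node.HasChildNodes
        && node.ChildNodes.Cast<XmlNode>().All(c =>
            c.NodeType == XmlNodeType.Text
            || c.NodeType == XmlNodeType.CDATA
            || c.NodeType == XmlNodeType.Whitespace
            || c.NodeType == XmlNodeType.SignificantWhitespace);
}
```

This sketch returns false for empty elements and for elements that contain comments or child elements, so decide separately how those cases should be treated.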
Some gotchas to be aware of and make decisions about:
Mixed content (<r>foo <v/> bar</r>): decide what you want to do with such nodes. For example, nodes with HTML content generally contain "mixed content".
Whitespace-only text nodes (<r> <n/> </r>): you should ignore those unless you must preserve document formatting.
Upvotes: 1