Reputation: 6863
I have been using .SelectNodes()
for some time to get the child nodes of a particular node. It works, but it seems to get slower as files get bigger. So I started using .ChildNodes
and I am finding that it gets more than just the child nodes: it seems to go deeper and return grandchild nodes as well.
Given this
$xml = [Xml]@"
<root>
    <one>
        <element1>element text</element1>
        <element2>element text</element2>
        <two>
            <element3>element text</element3>
        </two>
        <name>Name text</name>
    </one>
</root>
"@
foreach ($element in $xml.DocumentElement.ChildNodes | Where-Object { $_.NodeType -eq 'Element' }) {
    Write-Host "$($element.Name) $($element.InnerText)"
}
I would have expected to only get back the single <one>
node, as it is the only child node of root. And yet what I get back is Name text element textelement textelement textName text
which makes no sense at all to me. Especially since I would at least have expected multiple items with the last line being name Name text
. Instead the first item in the line is the last node's name.
Now I know naming a node 'name' is a bad idea, and I am working on code to address that. But even if I change the name of that node, so <nametext>Name text</nametext>
, what I get back is a different kind of wrong, one element textelement textelement textName text
.
So, what am I doing wrong? And, is it even possible to use .ChildNodes
and only get the actual child nodes, no deeper?
And, what IS going on here? As I suspect if I understood WHY the bowl of Petunias said that, I would understand the universe better.
Reputation: 437718
I would have expected to only get back the single
<one>
node
Indeed that is what you're getting back, though it isn't obvious from the default display formatting.
Instead the first item in the line is the last node's name.
The simplest way to visualize XML nodes is to access their .OuterXml
property, which returns an XML text representation of the node and all of its descendants.
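As a minimal sketch of that check (using a shortened version of the question's document, and a hypothetical variable name $children), the following counts the direct child elements of the root and prints the markup of the first one:

```powershell
# Shortened version of the XML from the question.
$xml = [xml]@"
<root>
    <one>
        <element1>element text</element1>
        <two>
            <element3>element text</element3>
        </two>
    </one>
</root>
"@

# Collect only the element nodes that are direct children of <root>.
# @(...) ensures an array even when a single node is returned.
$children = @($xml.DocumentElement.ChildNodes | Where-Object { $_.NodeType -eq 'Element' })

# Number of direct child elements of <root>.
$children.Count

# Markup of the first child, including all of its descendants.
$children[0].OuterXml
```

The count should be 1, and the .OuterXml text should start with the <one> tag, which suggests that .ChildNodes itself does not reach below the direct children; the deeper content only appears because it is part of the <one> node.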
PowerShell's adaptation of the XML DOM represents child elements and attributes of a given element as properties of that element, and such properties shadow (override) type-native properties of the underlying System.Xml.XmlNode
instance:
A <name>
child element (or attribute) therefore shadows the type-native Name
property, which is why Name text
showed up in your output.
To bypass the shadowing and access the type-native property, call its underlying accessor, the .get_Name()
method.
As an aside: another fairly common, albeit technically distinct[1], shadowing scenario is where an array of XmlElement
instances, as obtained via member-access enumeration, have <item>
child elements, which then cannot be accessed via .item
and require a looping workaround - see this answer.
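To illustrate the shadowing of Name described above, here is a small sketch with a reduced document; the variable name $one is only illustrative:

```powershell
# Reduced document: <one> has a child element called <name>.
$xml = [xml]'<root><one><name>Name text</name></one></root>'

# Adapted property access: returns the <one> element.
$one = $xml.root.one

# Adapted property: resolves to the <name> child element's text.
$one.Name          # Name text

# Type-native accessor method: returns the element's own name.
$one.get_Name()    # one

# LocalName is also type-native and is not shadowed in this document.
$one.LocalName     # one
```

Note that .LocalName would be shadowed in the same way if the element had a child or attribute named LocalName, so the accessor-method form is the more general workaround.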
But even if I change the name of that node, so
<nametext>Name text</nametext>
, what I get back is a different kind of wrong, one element textelement textelement textName text
.
There is nothing wrong with that result: it reflects the target node's element name, one
, followed by direct concatenation of the text nodes it contains, across the sub-node hierarchy (see System.Xml.XmlElement.InnerText
), that is, you're seeing the result of the following string concatenation:
'element text' + 'element text' + 'element text' + 'Name text'
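The following sketch, based on the renamed variant of the question's document, contrasts .InnerText on <one> with a listing of its direct child elements only. The variable names $one and $names are illustrative, and .LocalName is used instead of .Name on the assumption that no child element or attribute is called LocalName:

```powershell
# The question's document with <name> renamed to <nametext>.
$xml = [xml]@"
<root>
    <one>
        <element1>element text</element1>
        <element2>element text</element2>
        <two>
            <element3>element text</element3>
        </two>
        <nametext>Name text</nametext>
    </one>
</root>
"@

$one = $xml.root.one

# InnerText concatenates every text node below <one>, at any depth.
$one.InnerText

# Names of the direct child elements only; <element3> is a grandchild
# and therefore does not appear in this list.
$names = @(
    $one.ChildNodes |
        Where-Object { $_.NodeType -eq 'Element' } |
        ForEach-Object { $_.LocalName }
)
$names
```

Here $names should contain element1, element2, two and nametext, so .ChildNodes stays at one level; it is .InnerText that flattens the whole subtree into a single string.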
[1] In member-access enumeration, the type-native properties of an array/collection take precedence over properties of their elements, which is logically the inverse of the XML adaptation on a single XmlElement
, where the adapted property takes precedence, as discussed.