Yongwei Xing

Reputation: 13441

Python XML: get direct child nodes

I have an XML document like the one below. I want to get all the direct child nodes under Node1. I am trying to use childNodes; however, it returns Node21 and Node22 as well. How can I get just the direct child nodes?

<Node1>
  <Node11>
    <Node21>
    </Node21>
    <Node22>
    </Node22>
    <Node23>
    </Node23>
  </Node11>
  <Node12>
  </Node12>
  <Node13>
  </Node13>
</Node1>

UPDATE: Sorry for the confusion. I made a mistake; it seems childNodes does return only the direct child nodes. However, the number of items in childNodes still exceeds the number of real child nodes. When I print each nodeName, I get a lot of "#text" entries.

Upvotes: 2

Views: 4920

Answers (2)

Eli Bendersky

Reputation: 273416

xml.etree.ElementTree.Element supports the iterator protocol, so you can use list(elem) as follows:

# cElementTree was removed in Python 3.9; ElementTree uses the C accelerator when available.
import xml.etree.ElementTree as ET

s = '''
<Node1>
  <Node11>
    <Node21>
    </Node21>
    <Node22>
    </Node22>
    <Node23>
    </Node23>
  </Node11>
  <Node12>
  </Node12>
  <Node13>
  </Node13>
</Node1>
'''

root = ET.fromstring(s)

print(root)        # the <Node1> element itself
print(list(root))  # its direct children only: Node11, Node12, Node13
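
As a minimal sketch of what iteration yields in Python 3 (the names xml_text and child_tags are illustrative and not part of the original answer), looping over the root visits only its direct child elements, and ElementTree does not expose whitespace as separate "#text" nodes:

```python
import xml.etree.ElementTree as ET

# Same shape as the XML in the question, shortened for the example.
xml_text = """
<Node1>
  <Node11>
    <Node21></Node21>
    <Node22></Node22>
  </Node11>
  <Node12></Node12>
  <Node13></Node13>
</Node1>
"""

root = ET.fromstring(xml_text)

# Iterating an Element visits its direct child elements only;
# grandchildren such as Node21 are not included.
child_tags = [child.tag for child in root]
print(child_tags)  # ['Node11', 'Node12', 'Node13']
```

The whitespace between tags is still available as the .text and .tail attributes of the elements if it is ever needed.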

Upvotes: 4

Josiah

Reputation: 3326

There are two ways you can go about handling the text nodes. If you really want to keep using the DOM API (xml.dom), you can get rid of the text nodes with a filter:

>>> list(filter(lambda node: node.nodeType != xml.dom.Node.TEXT_NODE, myNode.childNodes))
[<DOM Element: Node11 at 0x18e64d0>, <DOM Element: Node12 at 0x18e6950>, <DOM Element: Node13 at 0x18e6a70>]

or a list comprehension:

>>> [x for x in myNode.childNodes if x.nodeType != xml.dom.Node.TEXT_NODE]
[<DOM Element: Node11 at 0x18e64d0>, <DOM Element: Node12 at 0x18e6950>, <DOM Element: Node13 at 0x18e6a70>]

If you don't need to keep using the DOM API, I would suggest using ElementTree as Eli Bendersky suggested.
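
For completeness, here is a self-contained sketch of the DOM approach, assuming the document is parsed with xml.dom.minidom (the question does not say which DOM implementation is in use, and the names xml_text, my_node, all_names and element_names are illustrative). It shows where the "#text" entries come from: the whitespace between tags is parsed into text nodes that sit alongside the elements in childNodes.

```python
import xml.dom.minidom
from xml.dom import Node

# Shortened version of the XML from the question.
xml_text = """<Node1>
  <Node11>
    <Node21></Node21>
  </Node11>
  <Node12></Node12>
  <Node13></Node13>
</Node1>"""

doc = xml.dom.minidom.parseString(xml_text)
my_node = doc.documentElement  # the <Node1> element

# childNodes holds direct children only, but that includes the
# whitespace-only text nodes between the tags.
all_names = [n.nodeName for n in my_node.childNodes]
print(all_names)

# Keep element nodes only.
element_names = [
    n.nodeName for n in my_node.childNodes
    if n.nodeType == Node.ELEMENT_NODE
]
print(element_names)  # ['Node11', 'Node12', 'Node13']
```

This variant tests for ELEMENT_NODE rather than excluding TEXT_NODE, which also drops comments and processing instructions; whether that is preferable depends on what the surrounding code expects.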

Upvotes: 1
