pybeginner

Reputation: 55

Python XML get immediate child elements only

I have an XML file like the one below:

<?xml version="1.0" encoding="utf-8"?>
<EDoc CID="1000101" Cname="somename" IName="iname" CSource="e1" Version="1.0">
<RIGLIST>
    <RIG RIGID="100001" RIGName="RgName1">
          <ListID>
            <nodeA nodeAID="1000011" nodeAName="node1A" nodeAExtID="9000011" />
            <nodeA nodeAID="1000012" nodeAName="node2A" nodeAExtID="9000012" />
            <nodeA nodeAID="1000013" nodeAName="node3A" nodeAExtID="9000013" />
            <nodeA nodeAID="1000014" nodeAName="node4A" nodeAExtID="9000014" />
            <nodeA nodeAID="1000015" nodeAName="node5A" nodeAExtID="9000015" />
            <nodeA nodeAID="1000016" nodeAName="node6A" nodeAExtID="9000016" />
            <nodeA nodeAID="1000017" nodeAName="node7A" nodeAExtID="9000017" />
          </ListID>
        </RIG>
    <RIG RIGID="100002" RIGName="RgName2">
          <ListID>
            <nodeA nodeAID="1000021" nodeAName="node1B" nodeAExtID="9000021" />
            <nodeA nodeAID="1000022" nodeAName="node2B" nodeAExtID="9000022" />
            <nodeA nodeAID="1000023" nodeAName="node3B" nodeAExtID="9000023" />
          </ListID>
        </RIG>
</RIGLIST>
</EDoc>

I need to search by the value of the RIGName attribute and, if a match is found, print the nodeAName values of all the nodeA elements under that RIG.

Example: searching for RIGName = "RgName2" should print node1B, node2B, node3B.

So far I am only able to do the first part, as below:

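import xml.etree.ElementTree as eT
import re

xmlfilePath = "Path of xml file"
searchtxt = "RgName2"  # RIGName value to search for
count = 0              # number of matching RIG elements

tree = eT.parse(xmlfilePath)
root = tree.getroot()

# RIGName is an attribute of <RIG>, so iterate over the RIG elements
for elem in root.iter("RIG"):
    # print(elem.tag, elem.attrib)
    if re.findall(searchtxt, elem.attrib['RIGName'], re.IGNORECASE):
        print(elem.attrib)
        count += 1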

How can I get only the immediate child node values?

Upvotes: 2

Views: 1950

Answers (1)

alecxe

Reputation: 473853

Switching from xml.etree to lxml lets you do it in a single step, because lxml has much better XPath support:

In [1]: from lxml import etree as ET

In [2]: tree = ET.parse('input.xml')

In [3]: root = tree.getroot()

In [4]: root.xpath('//RIG[@RIGName = "RgName2"]/ListID/nodeA/@nodeAName')
Out[4]: ['node1B', 'node2B', 'node3B']
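
If installing lxml is not an option, something similar can probably be done with the standard library, since xml.etree.ElementTree supports a limited XPath subset that includes [@attrib="value"] predicates. The following is only a sketch under assumptions: the XML is a trimmed copy of the document from the question, inlined so the example runs on its own, and node_names is a hypothetical helper name, not part of any library. It also keeps the case-insensitive match that the question's re.IGNORECASE suggests.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document, inlined so the sketch is self-contained.
XML = """<EDoc CID="1000101" Cname="somename" IName="iname" CSource="e1" Version="1.0">
<RIGLIST>
    <RIG RIGID="100001" RIGName="RgName1">
        <ListID>
            <nodeA nodeAID="1000011" nodeAName="node1A" nodeAExtID="9000011" />
            <nodeA nodeAID="1000012" nodeAName="node2A" nodeAExtID="9000012" />
        </ListID>
    </RIG>
    <RIG RIGID="100002" RIGName="RgName2">
        <ListID>
            <nodeA nodeAID="1000021" nodeAName="node1B" nodeAExtID="9000021" />
            <nodeA nodeAID="1000022" nodeAName="node2B" nodeAExtID="9000022" />
            <nodeA nodeAID="1000023" nodeAName="node3B" nodeAExtID="9000023" />
        </ListID>
    </RIG>
</RIGLIST>
</EDoc>"""


def node_names(root, rig_name):
    """Hypothetical helper: nodeAName values under every RIG whose
    RIGName equals rig_name, compared case-insensitively."""
    names = []
    for rig in root.iter("RIG"):
        if rig.get("RIGName", "").lower() != rig_name.lower():
            continue
        # A relative path without '//' only follows direct children:
        # RIG -> ListID -> nodeA.
        names.extend(node.get("nodeAName") for node in rig.findall("ListID/nodeA"))
    return names


root = ET.fromstring(XML)
print(node_names(root, "rgname2"))

# Exact-match variant using ElementTree's attribute predicate.
exact = [
    node.get("nodeAName")
    for node in root.findall('.//RIG[@RIGName="RgName2"]/ListID/nodeA')
]
print(exact)
```

Unlike iter(), which walks the whole subtree, findall() with a relative path such as "ListID/nodeA" only descends through the named child levels, which is what restricts the result to the matching RIG's own nodeA elements.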

Upvotes: 1
