Mallachar

Reputation: 241

Parse XML with Python when multiple children share a name

I currently have an XML file I am trying to parse. Here is my code thus far.

from xml.etree import ElementTree

with open('data.xml', 'rt') as f:
    tree = ElementTree.parse(f)

for node in tree.iter('Host'):
    hostname = node.find('Name').text
    ip = node.find('Networking/IP').text
    print(hostname)
    print(ip)

However, I am running into an issue because each of these devices has 3 IP addresses, so there are multiple XML "children" with the exact same name. Here is the sample (actual hostname obfuscated):

<?xml version="1.0" encoding="UTF-8"?>
<APIResponse>
  <HostRecords>
    <Type>Dedicated</Type>
      <Host>
        <Name>dc-01-a.domain.com</Name>
        <Active>1</Active>
        <Networking>
          <Primary>Yes</Weight>
          <IP>10.0.8.72</IP>
        </Networking>
        <Networking>
          <Primary>No</Weight>
          <IP>10.12.12.1</IP>
        </Networking>
        <Networking>
          <Primary>Yes</Weight>
          <IP>fd30:0000:0000:0001:ff4e:003e:0009:000e</IP>
        </Networking>
      </Host>
    </Type>
  </HostRecords>
</APIResponse>

My test script pulls the first IP, but how do I pull the next two? 'Networking/IP' matches in 3 places, but find only returns the first match. Also, how would I make it grab only the IPs that are labeled as Primary?

EDIT: If I try with findall instead of find I get

AttributeError: 'list' object has no attribute 'text'

If I remove the text part I get

[<Element 'RData' at 0x10ef67650>, <Element 'RData' at 0x10ef67750>, <Element 'RData' at 0x10ef67850>]

So it returns the elements, but not the actual readable data.

Upvotes: 0

Views: 5161

Answers (1)

James

Reputation: 3411

The find and findall methods accept a limited subset of XPath expressions. You can use this to extract only the IPs that are marked as Primary:

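from xml.etree import ElementTree

# `sample` is the XML document from the question, as a string.
tree = ElementTree.fromstring(sample)

for node in tree.iter('Host'):
    hostname = node.find('Name').text
    # The [Primary='Yes'] predicate keeps only Networking elements
    # whose Primary child has the text "Yes".
    ips = node.findall("Networking[Primary='Yes']/IP")
    print(hostname)
    for ip in ips:
        print(ip.text)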

For further information on which XPath expressions are allowed, see the documentation at: https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element
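
To cover the other half of the question (pulling all three IPs, not just the primary ones), here is a minimal, self-contained sketch. It assumes a well-formed stand-in for the API response (SAMPLE below is hypothetical, trimmed from the question's data), and host_ips and primary_only are names I made up for illustration. The key point is that findall returns a list of Element objects, so you read .text on each item rather than on the list itself.

```python
from xml.etree import ElementTree

# Hypothetical, well-formed stand-in for the API response in the question.
SAMPLE = """<APIResponse>
  <HostRecords>
    <Host>
      <Name>dc-01-a.domain.com</Name>
      <Networking><Primary>Yes</Primary><IP>10.0.8.72</IP></Networking>
      <Networking><Primary>No</Primary><IP>10.12.12.1</IP></Networking>
      <Networking><Primary>Yes</Primary><IP>fd30:0000:0000:0001:ff4e:003e:0009:000e</IP></Networking>
    </Host>
  </HostRecords>
</APIResponse>"""


def host_ips(xml_text, primary_only=False):
    """Return {hostname: [ip, ...]} for every Host element in xml_text."""
    root = ElementTree.fromstring(xml_text)
    # Without the predicate, every Networking/IP under the host matches.
    path = "Networking[Primary='Yes']/IP" if primary_only else "Networking/IP"
    result = {}
    for host in root.iter('Host'):
        # findall returns a list of Elements; read .text on each one.
        result[host.findtext('Name')] = [ip.text for ip in host.findall(path)]
    return result


all_ips = host_ips(SAMPLE)
primary_ips = host_ips(SAMPLE, primary_only=True)
print(all_ips)
print(primary_ips)
```

With this stand-in data, all_ips holds all three addresses for the host and primary_ips holds only the two whose Networking element has Primary set to Yes.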


The sample XML provided in the question is malformed in a couple of places (presumably introduced when it was obfuscated for posting, since otherwise the code example given could never have worked). The Type tag is closed twice, and the Primary tags are mismatched with closing Weight tags.
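
If you want to confirm this kind of problem yourself, a rough sketch follows. The two one-element snippets are reduced, hypothetical stand-ins for the question's sample (one keeps the mismatched closing tag, one assumes the obvious correction), and try_parse is a helper name introduced only for illustration. ElementTree raises ParseError when the input is not well-formed, so catching it tells you whether the document can be parsed at all.

```python
from xml.etree import ElementTree

# Reduced copy of the question's data, keeping the mismatched closing tag.
malformed = "<Networking><Primary>Yes</Weight><IP>10.0.8.72</IP></Networking>"
# Assumed correction: close Primary with its own tag.
corrected = "<Networking><Primary>Yes</Primary><IP>10.0.8.72</IP></Networking>"


def try_parse(xml_text):
    """Return the parsed root element, or None if the text is not well-formed."""
    try:
        return ElementTree.fromstring(xml_text)
    except ElementTree.ParseError as exc:
        # The exception message includes the line and column of the problem.
        print("not well-formed: %s" % exc)
        return None


bad = try_parse(malformed)    # prints a "not well-formed" message
good = try_parse(corrected)   # parses normally
print(good.findtext('IP'))
```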

Upvotes: 2
