Fluxy

Reputation: 2978

How to retrieve all values of a specific attribute from sub-elements that contain this attribute?

I have the following XML file:

<main>
  <node>
    <party iot="00">Big</party>
    <children type="me" value="3" iot="A">
       <p>
          <display iot="B|S">
             <figure iot="FF"/>
          </display>
       </p>
       <li iot="C"/>
       <ul/>
    </children>
  </node>
  <node>
    <party iot="01">Small</party>
    <children type="me" value="1" iot="N">
       <p>
          <display iot="T|F">
             <figure iot="MM"/>
          </display>
       </p>
    </children>
  </node>
</main>

How can I retrieve all values of the iot attribute from the children element of the first node and from all of its sub-elements? I need the values of iot as a list.

The expected result:

iot_list = ['A','B|S','FF','C']

This is my current code:

import xml.etree.ElementTree as ET

mytree = ET.parse("file.xml")
myroot = mytree.getroot()
list_nodes = myroot.findall('node')
for n in list_nodes:
   # ???

Upvotes: 1

Views: 106

Answers (1)

Jack Fleeting

Reputation: 24930

This is easier to do with the lxml library, which supports full XPath 1.0 rather than the limited subset available in xml.etree.ElementTree.

If the sample XML in your question represents the exact structure of the actual XML:

from lxml import etree
data = """[your xml above]"""
doc = etree.XML(data)

print(doc.xpath('//node[1]//*[not(self::party)][@iot]/@iot'))

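More generically, selecting the children element directly instead of excluding party:

for t in doc.xpath('//node[1]//children'):
    # collect iot from <children> itself and from every element below it
    print(t.xpath('descendant-or-self::*/@iot'))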

In either case, the output should be:

['A', 'B|S', 'FF', 'C']
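
If you would rather stay with the standard library's xml.etree.ElementTree, as in the question, something along these lines should also work. This is only a sketch: it inlines the sample XML from the question as a string (the name XML is just for illustration) and assumes the children element you want is the first one in document order. Element.iter() walks the element itself and then all of its descendants in document order, so no XPath is needed.

```python
import xml.etree.ElementTree as ET

# Sample XML copied from the question (inlined here instead of reading file.xml)
XML = """
<main>
  <node>
    <party iot="00">Big</party>
    <children type="me" value="3" iot="A">
       <p>
          <display iot="B|S">
             <figure iot="FF"/>
          </display>
       </p>
       <li iot="C"/>
       <ul/>
    </children>
  </node>
  <node>
    <party iot="01">Small</party>
    <children type="me" value="1" iot="N">
       <p>
          <display iot="T|F">
             <figure iot="MM"/>
          </display>
       </p>
    </children>
  </node>
</main>
"""

root = ET.fromstring(XML)

# find() returns the first match in document order,
# i.e. the <children> element of the first <node>
children = root.find('node/children')

# iter() yields <children> itself, then every descendant;
# keep only the elements that actually carry an iot attribute
iot_list = [el.get('iot') for el in children.iter() if 'iot' in el.attrib]

print(iot_list)
```

With a file on disk, replacing ET.fromstring(XML) with ET.parse("file.xml").getroot() should give the same list.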

Upvotes: 1
