Reputation: 768
I want to iterate through my XML tree and retrieve the attributes of all children of a selected parent. This is my parsing setup:
import xml.etree.ElementTree as ET
file_name = 'myXML.xml'
tree = ET.parse(file_name)
root = tree.getroot()
The function I have uses nested for loops, so it needs one for loop per layer of generations. Essentially, each parent loops through each child and prints its .tag, .text, and .attrib.
Is there a way to loop through and collect all this data without knowing the number of layers? This is the current function:
def data_dump(k, mD, st):
    # For every element tagged k, print up to four levels of descendants.
    for na in mD.iter(k):
        for a in na:
            print(st + '> a:: ', a.tag, a.text, a.attrib)
            for b in a:
                print('|-->', ' b:: ', b.tag, b.text, b.attrib)
                for c in b:
                    print('|---->', ' c:: ', c.tag, c.text, c.attrib)
                    for d in c:
                        print('|------>', ' d:: ', d.tag, d.text, d.attrib)
These are my test cases:
data_dump('Title', root, 'TITLE')
data_dump('Comment', root, 'COM')
data_dump('Steps', root, 'STEP')
data_dump('Transitions', root, 'TRANS')
data_dump('Branches', root, 'BRAN')
data_dump('Connections', root, 'CONN')
data_dump('Sequence', root, 'SEQ')
Upvotes: 0
Views: 3421
Reputation: 7754
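Your implementation hard-codes one loop per level, so it only reaches four levels below the matched element and has to be edited whenever the XML gets deeper. The nesting itself is not a complexity problem (each loop only walks the children of one element, so every descendant is visited once), but the fixed depth makes the function brittle.
What I would recommend for your problem is to use XPath. ElementTree supports a limited subset of XPath through find() and findall():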
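import xml.etree.ElementTree as ET

file_name = 'myXML.xml'  # same file as in the question
root = ET.parse(file_name).getroot()
result = ''
# Matches every grandchild element directly under a child element, at any depth.
for elem in root.findall('.//child/grandchild'):
    if elem.attrib.get('name') == 'foo':
        result = elem.text
        break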
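To make the pattern concrete, here is a small self-contained sketch that runs the same kind of findall() query against an inline document. The sample XML and the helper name find_text_by_name are made up for illustration and are not taken from the question's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = """
<root>
  <child>
    <grandchild name="bar">first</grandchild>
    <grandchild name="foo">second</grandchild>
  </child>
  <other>
    <grandchild name="foo">ignored</grandchild>
  </other>
</root>
""".strip()


def find_text_by_name(root, path, name):
    """Return the text of the first element matching path whose
    'name' attribute equals name, or None if nothing matches."""
    for elem in root.findall(path):
        if elem.attrib.get('name') == name:
            return elem.text
    return None


sample_root = ET.fromstring(SAMPLE)
# Only grandchild elements that sit directly under a child element are considered,
# so the one under <other> is skipped.
print(find_text_by_name(sample_root, './/child/grandchild', 'foo'))
```

Returning from the helper takes the place of the result variable and the break in the loop above.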
You can incorporate the same idea into your function and turn it into something like:
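def data_dump(element, value):
    # Relies on the module-level root parsed above.
    for elem in root.findall('.//parent/' + element):
        if elem.attrib.get('name') == value:
            # Return the first match so the caller can actually use it.
            return elem.text, elem.attrib, elem.tag
    return None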
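Since the question asks for a dump that does not depend on the number of layers, a recursive walk is one alternative to a fixed XPath. The following is only a sketch under assumptions: walk and dump_all are hypothetical names, the output format is a guess at what is wanted, and the sample XML is invented for the demonstration.

```python
import xml.etree.ElementTree as ET


def walk(element, depth=0):
    """Yield (depth, tag, text, attrib) for every descendant of element."""
    for child in element:
        text = (child.text or '').strip()
        yield depth, child.tag, text, dict(child.attrib)
        # Recurse so grandchildren, great-grandchildren, etc. are all covered.
        yield from walk(child, depth + 1)


def dump_all(tag, root, label):
    """Print every descendant of each element named tag, marked by depth."""
    for parent in root.iter(tag):
        for depth, child_tag, text, attrib in walk(parent):
            print(label + '> ' + '--' * depth, child_tag, text, attrib)


# Hypothetical sample document, for illustration only.
sample = ET.fromstring(
    '<doc><Steps><Step id="1"><Name>start</Name></Step></Steps></doc>'
)
rows = list(walk(sample.find('Steps')))
dump_all('Steps', sample, 'STEP')
```

Called as dump_all('Steps', root, 'STEP'), this would stand in for the fixed a/b/c/d loops. If depth information is not needed, iterating over parent.iter() gives a flat traversal of all descendants without recursion.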
Upvotes: 1