leopardxpreload

Reputation: 768

How to iterate through XML tree children from a chosen parent?

I want to iterate through my XML tree and retrieve the attributes of all children of a selected parent. This is my parsing setup:

import xml.etree.ElementTree as ET

file_name = 'myXML.xml'
tree = ET.parse(file_name)
root = tree.getroot()

The function I have uses nested for loops, but it needs one loop per generation: each parent loops through its children and prints .tag, .text, and .attrib.

Is there a way to loop through and collect all this data without knowing the number of layers in advance?

def data_dump(k, mD, st):
    for na in mD.iter(k):
        for a in na:
            print(st + '> a:: ', a.tag, a.text, a.attrib)
            for b in a:
                print('|-->', ' b:: ', b.tag, b.text, b.attrib)
                for c in b:
                    print('|---->', ' c:: ', c.tag, c.text, c.attrib)
                    for d in c:
                        print('|------>', ' d:: ', d.tag, d.text, d.attrib)

These are my test cases:

data_dump('Title', root, 'TITLE')
data_dump('Comment', root, 'COM')
data_dump('Steps', root, 'STEP')
data_dump('Transitions', root, 'TRANS')
data_dump('Branches', root, 'BRAN')
data_dump('Connections', root, 'CONN')
data_dump('Sequence', root, 'SEQ')

Upvotes: 0

Views: 3421

Answers (1)

AzyCrw4282

Reputation: 7754

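Your implementation does not scale well. Each extra level of the tree needs another hand-written loop, so the function only reaches four levels below the matched element and ignores anything deeper. The nesting is not a run-time problem in itself, since each element is still visited once, but it makes the code rigid and hard to maintain.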

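What I would recommend for your problem is XPath, which ElementTree supports in a limited form through find() and findall(); see the "XPath support" section of the xml.etree.ElementTree documentation.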

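import xml.etree.ElementTree as ET

filename = 'myXML.xml'  # same file as in the question
root = ET.parse(filename).getroot()
result = ''

# './/' searches at any depth below root, so no nested loops are needed
for elem in root.findall('.//child/grandchild'):
    if elem.attrib.get('name') == 'foo':
        result = elem.text
        break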
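
The attribute check in the loop above can also be written into the path itself, because ElementTree's XPath subset accepts `[@attrib='value']` predicates. The following is a minimal sketch of that variant; the sample XML and its element names are hypothetical and only mirror the `child`/`grandchild` placeholders used above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, standing in for the real file
SAMPLE = """
<root>
  <child>
    <grandchild name="bar">first</grandchild>
    <grandchild name="foo">second</grandchild>
  </child>
</root>
"""

root = ET.fromstring(SAMPLE)

# The predicate filters on the attribute; find() returns the first match or None
match = root.find(".//child/grandchild[@name='foo']")
result = match.text if match is not None else ''
print(result)
```

This drops both the explicit `if` and the `break`, at the cost of building the path string by hand when the value is not a constant.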

You can incorporate the same idea into your function and turn it into something like this:

def data_dump(element, value):
    # Uses the root parsed above; returns (tag, text, attrib) of the
    # first matching element, or None if nothing matches
    for elem in root.findall('.//parent/' + element):
        if elem.attrib.get('name') == value:
            return elem.tag, elem.text, elem.attrib
    return None
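
For the question as asked, walking an unknown number of layers below a chosen parent, recursion is another option that avoids writing one loop per level. This is a minimal sketch, assuming the goal is to print the tag, text, and attributes of every descendant. The `walk` helper and the sample XML are hypothetical, and the `data_dump` signature follows the one in the question.

```python
import xml.etree.ElementTree as ET


def walk(element, depth=0):
    """Yield (depth, tag, text, attrib) for every descendant of element."""
    for child in element:
        text = (child.text or '').strip()  # text is None for empty elements
        yield depth, child.tag, text, child.attrib
        # Recurse into the child, so any number of layers is covered
        yield from walk(child, depth + 1)


def data_dump(tag, root, label):
    """Print every descendant of each element named tag, indented by depth."""
    for parent in root.iter(tag):
        for depth, child_tag, text, attrib in walk(parent):
            print(label + '> ' + '  ' * depth, child_tag, text, attrib)


# Hypothetical sample document, standing in for myXML.xml
SAMPLE = "<Doc><Steps><Step id='1'><Name>start</Name></Step><Step id='2'/></Steps></Doc>"
root = ET.fromstring(SAMPLE)

data_dump('Steps', root, 'STEP')
rows = list(walk(root.find('Steps')))
```

If only a flat list is needed and the depth does not matter, `parent.iter()` visits the parent and all of its descendants in document order without any helper.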

Upvotes: 1
