CronosVirus00

Reputation: 129

Python XML: Looping through children to get separated values for each child

I have this xml file:

<SESSION_INFO>
<start_time>2018-10-16 22:44:38.36 -0500</start_time>
</SESSION_INFO>
<ALL_INSTANCES>
<instance>
<ID>1</ID>
<start>4.3974745990</start>
<end>13.6332131403</end>
<code>Button 013</code>
<label>
<text>1,2</text>
</label>
<label>
<text>0,4</text>
</label>
<label>
<text>2,3</text>
</label>
</instance>
<instance>
<ID>2</ID>
<start>513.0491021980</start>
<end>524.9834182373</end>
<code>Button 013</code>
<label>
<text>1,2</text>
</label>
<label>
<text>1,4</text>
</label>
<label>
<text>1,3</text>
</label>
<label>
<text>0,1</text>
</label>
<label>
<text>1,3</text>
</label>
<label>
<text>0,4</text>
</label>
</instance>
</ALL_INSTANCES>

I wrote some code to extract all the data from label/text and put it in a list:

import xml.etree.ElementTree as ET

tree = ET.parse('/Desktop/XML Edit list.xml')
root = tree.getroot()

labels = []
for each in root.findall('.//ALL_INSTANCES/instance/label'):
    rating = each.find('.//text')
    # Report labels without a <text> child; otherwise collect the value
    if rating is None:
        print('Empty')
    else:
        labels.append(rating.text)

print(labels)

The next step, which I can't get my head around, is to create a separate list of the <label> values for each <instance> (2 in this example). I feel like I need a for loop that goes into each <instance>, pulls out the data and writes it into a list that is then appended to labels. However, I cannot go through each instance separately; looping with .find and .get did not get me very far, and that was my best shot.

Thank you in advance for your help, Cronos

EDIT 1 Adding ideal output as per request:

[['1,2', '0,4', '2,3'], ['1,2', '1,4', '1,3', '0,1', '1,3', '0,4']]

EDIT 2 I tried to achieve this by adding another list inside the loop: all_labels is first appended to result and then reset, so it can collect the values for the next instance. Something like:

all_labels = []
result = []
for child in root.iter():
    for instance in child.findall('instance'):
        for label in instance.findall('label'):
            all_labels = []
            for val in label.findall('text'):
                all_labels.append(val.text)
                result.append(all_labels)

But I cannot make it work.

EDIT 3 Almost got it, thanks to LeKhan9, who showed a simpler approach. Based on his idea, I created another list that saves the result of each loop, but the output starts with an empty list, so it is not "clean":

all_labels = []
result = []
for child in root.iter():    
    for instance in child.findall('instance'):        
        result.append(all_labels)    
        all_labels = []
        for label in instance.findall('label'):            
            for val in label.findall('text'):
                all_labels.append(val.text)

result.append(all_labels)

print(result)
[[], ['1,2', '0,4', '2,3'], ['1,2', '1,4', '1,3', '0,1', '1,3', '0,4']]
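
The leading [] appears because result.append(all_labels) runs at the top of the first instance, while all_labels is still empty. Below is a minimal sketch of that pattern on plain lists, next to a variant that appends after the inner loop instead. The nested groups data and the names with_empty, clean and current are made up for illustration; no XML is involved.

```python
# Made-up nested data standing in for the instances and their label texts.
groups = [['1,2', '0,4', '2,3'], ['1,2', '1,4']]

# Pattern from EDIT 3: append at the top of each pass, then once after the loop.
all_labels = []
with_empty = []
for group in groups:
    with_empty.append(all_labels)  # the first pass appends the still-empty list
    all_labels = []
    for value in group:
        all_labels.append(value)
with_empty.append(all_labels)

# Appending after the inner loop avoids the leading empty list.
clean = []
for group in groups:
    current = []
    for value in group:
        current.append(value)
    clean.append(current)

print(with_empty)
print(clean)
```

With the append moved to the end of each pass, every inner list is complete before it is stored, so no placeholder list is ever added.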

Upvotes: 0

Views: 531

Answers (1)

LeKhan9

Reputation: 1350

You can always take a deliberate approach and parse each level of the tree as such:

from xml.etree import ElementTree as ET


tree = ET.parse('test.xml')
root = tree.getroot()

all_labels = []
for child in root.iter():
    for instance in child.findall('instance'):
        for label in instance.findall('label'):
            for val in label.findall('text'):
                all_labels.append(val.text)

print(all_labels)

Output:

['1,2', '0,4', '2,3', '1,2', '1,4', '1,3', '0,1', '1,3', '0,4']

Updating based on the OP's expected output:

from xml.etree import ElementTree as ET


tree = ET.parse('test.xml')
root = tree.getroot()

result = []
for child in root.iter():
    for instance in child.findall('instance'):
        current_labels = []
        for label in instance.findall('label'):
            for val in label.findall('text'):
                current_labels.append(val.text)
        result.append(current_labels)

print(result)

Output:

[['1,2', '0,4', '2,3'], ['1,2', '1,4', '1,3', '0,1', '1,3', '0,4']]
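
The same grouping can also be sketched more compactly with a list comprehension. This is only a sketch under a few assumptions: the XML shown in the question has no single root element, so the data is wrapped in a made-up <file> element here, and a shortened inline copy is parsed with ET.fromstring instead of reading test.xml.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the question's data. The <file> wrapper is an assumption:
# ElementTree needs exactly one root element.
XML = """
<file>
  <SESSION_INFO>
    <start_time>2018-10-16 22:44:38.36 -0500</start_time>
  </SESSION_INFO>
  <ALL_INSTANCES>
    <instance>
      <ID>1</ID>
      <label><text>1,2</text></label>
      <label><text>0,4</text></label>
      <label><text>2,3</text></label>
    </instance>
    <instance>
      <ID>2</ID>
      <label><text>1,2</text></label>
      <label><text>1,4</text></label>
    </instance>
  </ALL_INSTANCES>
</file>
"""

root = ET.fromstring(XML)

# One inner list per <instance>; the path 'label/text' matches every <text>
# that is a child of a <label> child of that instance.
result = [
    [text.text for text in instance.findall('label/text')]
    for instance in root.iter('instance')
]

print(result)
```

root.iter('instance') visits each <instance> once wherever it sits in the tree, so the outer loop over every element is not needed in this variant.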

Upvotes: 1
