Alvins

Reputation: 887

Python findall only first occurrence

I've got this XML (given as the base string in the code below), with multiple nested "constituents" tags. I need to iterate over each level:

import sys
from xml.etree import ElementTree as et

base="<ss><cod>cod1</cod><measure><m>1</m></measure><constituents><cod>const1</cod><measure><m>2</m></measure><constituents><cod>const1_1</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>4</m></measure></constituents></constituents><constituents><cod>const1_2</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>42</m></measure></constituents></constituents></constituents></ss>"
tsString = et.fromstring(base)
ss = tsString.iter('ss')  # getiterator() was removed in Python 3.9; iter() is the replacement
for r in ss:
    measure = r.findall('.//constituents')  # (1) returns const1, const1_1, const3, const1_2, const3_2; only const1 is needed
    for c in measure:
        measure1 = c.findall('.//constituents')  # (2) returns const1_1, const3, const1_2, const3_2; only const1_1 and const1_2 are needed
        ...

But findall returns every occurrence of constituents, at any depth. I need the findall at (1) to return only "const1", the one at (2) to return only "const1_1" and "const1_2", and a third one (3) to return only "const3" and "const3_2".

How can I fix the two findall calls?

Upvotes: 2

Views: 951

Answers (2)

user_3068807

Reputation: 407

.// selects all descendants at every depth, not just the children. Use ./ to get only the direct children, i.e. the ones in the next "step".

import sys
from xml.etree import ElementTree as et

base="<ss><cod>cod1</cod><measure><m>1</m></measure><constituents><cod>const1</cod><measure><m>2</m></measure><constituents><cod>const1_1</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>4</m></measure></constituents></constituents><constituents><cod>const1_2</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>42</m></measure></constituents></constituents></constituents></ss>"
tsString = et.fromstring(base)
ss = tsString.iter('ss')  # getiterator() was removed in Python 3.9; iter() is the replacement
for r in ss:
    measure = r.findall('./constituents')  # (1) direct children only: const1
    for t in measure:  # for test
        print(t[0].text)  # for test
    for c in measure:
        measure1 = c.findall('./constituents')  # (2) direct children only: const1_1, const1_2
        for t in measure1:  # for test
            print(t[0].text)  # for test
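
If the nesting depth is not known in advance, the same ./constituents lookup can be applied recursively instead of writing one loop per level. The following is only a sketch under that assumption; the walk helper and its depth/out parameters are hypothetical names, not part of the question or of ElementTree.

```python
from xml.etree import ElementTree as et

# Same document as in the question, split across lines for readability.
base = (
    "<ss><cod>cod1</cod><measure><m>1</m></measure>"
    "<constituents><cod>const1</cod><measure><m>2</m></measure>"
    "<constituents><cod>const1_1</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>4</m></measure></constituents>"
    "</constituents>"
    "<constituents><cod>const1_2</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>42</m></measure></constituents>"
    "</constituents>"
    "</constituents></ss>"
)


def walk(node, depth=1, out=None):
    """Collect (depth, cod text) for every nested <constituents>, level by level.

    Hypothetical helper: it only ever looks at direct children of `node`.
    """
    if out is None:
        out = []
    for child in node.findall('./constituents'):  # direct children only
        out.append((depth, child.findtext('cod')))
        walk(child, depth + 1, out)  # descend one level
    return out


levels = walk(et.fromstring(base))
for depth, cod in levels:
    print(depth, cod)
```

Each entry pairs the nesting depth with the cod text of that element, so the caller can tell which level a given constituents element belongs to.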

Upvotes: 0

alecxe

Reputation: 473873

Just omit the .// part to perform a non-recursive search that matches only direct children of the current node:

for r in ss:
    measure = r.findall('constituents')
    for c in measure:
        measure1 = c.findall('constituents')
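
To make the difference concrete, here is a small self-contained sketch that runs both path forms against the XML from the question. The cods helper and the level1/level2/level3 names are illustrative assumptions, not part of the original code.

```python
from xml.etree import ElementTree as et

# Same document as in the question, split across lines for readability.
base = (
    "<ss><cod>cod1</cod><measure><m>1</m></measure>"
    "<constituents><cod>const1</cod><measure><m>2</m></measure>"
    "<constituents><cod>const1_1</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>4</m></measure></constituents>"
    "</constituents>"
    "<constituents><cod>const1_2</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>42</m></measure></constituents>"
    "</constituents>"
    "</constituents></ss>"
)
root = et.fromstring(base)


def cods(elements):
    """Hypothetical helper: the <cod> text of each element, for display."""
    return [e.findtext('cod') for e in elements]


# './/constituents' matches descendants at every depth.
all_levels = cods(root.findall('.//constituents'))

# 'constituents' matches direct children only, so each call moves one level down.
level1 = root.findall('constituents')
level2 = [c for parent in level1 for c in parent.findall('constituents')]
level3 = [c for parent in level2 for c in parent.findall('constituents')]

print(all_levels)
print(cods(level1), cods(level2), cods(level3))
```

Note that both third-level elements carry the cod text const3 in this document; the question refers to the second one as const3_2.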

Upvotes: 2
