Reputation: 887
I've got this XML (the base string in the code below) with multiple nested "constituents" tags. I need to iterate over each level:
import sys
from xml.etree import ElementTree as et
base="<ss><cod>cod1</cod><measure><m>1</m></measure><constituents><cod>const1</cod><measure><m>2</m></measure><constituents><cod>const1_1</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>4</m></measure></constituents></constituents><constituents><cod>const1_2</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>42</m></measure></constituents></constituents></constituents></ss>"
tsString = et.fromstring(base)
ss = tsString.iter('ss')  # getiterator() was removed in Python 3.9
for r in ss:
    # (1) returns const1, const1_1, const3, const1_2, const3_2; only const1 is needed
    measure = r.findall('.//constituents')
    for c in measure:
        # (2) returns const1_1, const3, const1_2, const3_2; only const1_1 and const1_2 are needed
        measure1 = c.findall('.//constituents')
        # ....
But findall returns every occurrence of constituents.
I need findall (1) to return only the "const1" measures, (2) to return only "const1_1" and "const1_2", and (3) to return "const3" and "const3_2".
How can I fix the two findall calls?
Upvotes: 2
Views: 951
Reputation: 407
.// matches all descendants at any depth, not just the direct children. Use ./ to get only the ones in the next "step".
import sys
from xml.etree import ElementTree as et
base="<ss><cod>cod1</cod><measure><m>1</m></measure><constituents><cod>const1</cod><measure><m>2</m></measure><constituents><cod>const1_1</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>4</m></measure></constituents></constituents><constituents><cod>const1_2</cod><measure><m>3</m></measure><constituents><cod>const3</cod><measure><m>42</m></measure></constituents></constituents></constituents></ss>"
tsString = et.fromstring(base)
ss = tsString.iter('ss')  # getiterator() was removed in Python 3.9
for r in ss:
    measure = r.findall('./constituents')  # (1) now returns only const1
    for t in measure:  # for test
        print(t[0].text)  # for test
    for c in measure:
        measure1 = c.findall('./constituents')  # (2) now returns only const1_1 and const1_2
        for t in measure1:  # for test
            print(t[0].text)  # for test
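To reach the third level (and any deeper one) without writing another nested loop per level, the same ./constituents step can be applied recursively. This is only a minimal sketch: walk is a hypothetical helper name, and it assumes every constituents element has a cod child, as in the question's XML.

```python
from xml.etree import ElementTree as et

# Same document as in the question, split across lines for readability.
base = (
    "<ss><cod>cod1</cod><measure><m>1</m></measure>"
    "<constituents><cod>const1</cod><measure><m>2</m></measure>"
    "<constituents><cod>const1_1</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>4</m></measure></constituents>"
    "</constituents>"
    "<constituents><cod>const1_2</cod><measure><m>3</m></measure>"
    "<constituents><cod>const3</cod><measure><m>42</m></measure></constituents>"
    "</constituents>"
    "</constituents></ss>"
)


def walk(node, depth=1, out=None):
    """Collect (depth, cod text) pairs, descending one level per call.

    Hypothetical helper: './constituents' only matches direct children,
    so each recursive call handles exactly one nesting level.
    """
    if out is None:
        out = []
    for child in node.findall('./constituents'):
        out.append((depth, child.findtext('cod')))
        walk(child, depth + 1, out)
    return out


levels = walk(et.fromstring(base))
for depth, cod in levels:
    print(depth, cod)
```

Each tuple carries the nesting depth, so the results can be grouped per level afterwards if that is what is needed.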
Upvotes: 0
Reputation: 473873
Just omit the .// part to perform a non-recursive search in the current node:
for r in ss:
    measure = r.findall('constituents')
    for c in measure:
        measure1 = c.findall('constituents')
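As a quick check of the difference, the three path forms can be compared side by side. This sketch uses a small made-up document (the cod values a and b are placeholders, not from the question) rather than the full XML:

```python
from xml.etree import ElementTree as et

# Hypothetical two-level document: <constituents> a contains <constituents> b.
root = et.fromstring(
    "<ss><constituents><cod>a</cod>"
    "<constituents><cod>b</cod></constituents>"
    "</constituents></ss>"
)

# A bare tag name and './tag' both match direct children only.
direct = [c.findtext('cod') for c in root.findall('constituents')]
dot_slash = [c.findtext('cod') for c in root.findall('./constituents')]

# './/tag' matches descendants at every depth, in document order.
recursive = [c.findtext('cod') for c in root.findall('.//constituents')]

print(direct, dot_slash, recursive)
```

Here direct and dot_slash both contain only the top-level element, while recursive also picks up the nested one.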
Upvotes: 2