Bogota

Reputation: 391

Fetch xml tag values recursively using ElementTree

I have an XML document of this type:

<SCHOOL>
    <GROUP name="GetStudInfo">

        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Palak Bisht</NAME>
            <STD>11th</STD>
        </DATA>
    </GROUP>
</SCHOOL>

I need to fetch the values of NAME, STD. I tried doing this:

import xml.etree.ElementTree as ET

# getunitinfo_str holds the XML shown above
e = ET.ElementTree(ET.fromstring(getunitinfo_str))
for elt in e.iter():
    print("{} {}".format(elt.tag, elt.text))

But this prints every other element as well. Output:

SCHOOL

GROUP


DATA

NAME Sahil Jha
STD 11th
DATA

NAME Rashmi Kaur
STD 11th
DATA

NAME Palak Bisht
STD 11th
{}

Expected output:

{'Sahil Jha':'11th', 'Rashmi Kaur':'11th', 'Palak Bisht':'11th'}

That is, the result should be a dictionary of NAME:STD pairs. Where am I going wrong?

Upvotes: 0

Views: 111

Answers (3)

balderman

Reputation: 23815

One liner below

import xml.etree.ElementTree as ET

xml = '''<SCHOOL>
    <GROUP name="GetStudInfo">
        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>
        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
            <STD>116th</STD>
        </DATA>
        <DATA>
            <NAME type="char">Palak Bisht</NAME>
            <STD>17th</STD>
        </DATA>
</GROUP>
</SCHOOL>'''

root = ET.fromstring(xml)
data = {x.find("NAME").text: x.find("STD").text for x in root.findall('.//DATA')}
print(data)

output

{'Sahil Jha': '11th', 'Rashmi Kaur': '116th', 'Palak Bisht': '17th'}

Upvotes: 0

furas

Reputation: 142631

You need something more than just print(): you need an if check on elt.tag so that you only get NAME and STD.

Because NAME and STD are different tags, you have to remember NAME in a variable and use it when you reach STD.

name = None  # default value at start

for elt in e.iter():
    if elt.tag == 'NAME':
        name = elt  # remember element
    if elt.tag == 'STD':
        print("{}:{}".format(name.text, elt.text))

Or you could use XPath, as in @qouify's answer.


Minimal working code

getunitinfo_str = '''
<SCHOOL>
    <GROUP name="GetStudInfo">

        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Palak Bisht</NAME>
            <STD>11th</STD>
        </DATA>
    </GROUP>
</SCHOOL>
'''

import xml.etree.ElementTree as ET

e = ET.ElementTree(ET.fromstring(getunitinfo_str))

name = None  # to remember the last NAME element

for elt in e.iter():
    if elt.tag == 'NAME':
        name = elt
    if elt.tag == 'STD':
        print("{}:{}".format(name.text, elt.text))
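
If you want the dictionary from the question rather than printed lines, the same loop can fill a dict instead. This is only a sketch: it assumes every NAME is followed by its STD in document order, and `result` is a name I picked for illustration.

```python
import xml.etree.ElementTree as ET

getunitinfo_str = '''
<SCHOOL>
    <GROUP name="GetStudInfo">
        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>
        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
            <STD>11th</STD>
        </DATA>
    </GROUP>
</SCHOOL>
'''

root = ET.fromstring(getunitinfo_str)

result = {}   # hypothetical name for the NAME -> STD mapping
name = None   # text of the most recent NAME element

for elt in root.iter():
    if elt.tag == 'NAME':
        name = elt.text
    elif elt.tag == 'STD' and name is not None:
        result[name] = elt.text
        name = None  # reset so a stray STD is not paired with an old NAME

print(result)
```

Two students with the same name would overwrite each other here, since dict keys are unique.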

Upvotes: 1

qouify

Reputation: 3900

As mentioned by @furas, you can use an XPath expression to find all DATA elements and then find the NAME and STD elements inside each one:


import xml.etree.ElementTree as ET

xml = '''<SCHOOL>
    <GROUP name="GetStudInfo">

        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
            <STD>11th</STD>
        </DATA>

        <DATA>
            <NAME type="char">Palak Bisht</NAME>
            <STD>11th</STD>
        </DATA>
</GROUP>
</SCHOOL>'''


e = ET.fromstring(xml)
# './/DATA' searches all descendants; plain 'DATA' would only match direct children of SCHOOL
for data_tag in e.findall('.//DATA'):
    name = data_tag.find('NAME')
    std = data_tag.find('STD')
    print("{} {}".format(name.text, std.text))

Or you can use a dict comprehension to get the dictionary you want:

my_dict = {
    data_tag.find('NAME').text: data_tag.find('STD').text
    for data_tag in e.findall('.//DATA')
}
print(my_dict)
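
One caveat: find() returns None when a child is missing, so .text would raise AttributeError on an incomplete DATA element. A possible way around it, assuming you would rather skip entries without a NAME and fall back to a placeholder for a missing STD (the 'unknown' value is my own choice, not something from the question), is findtext():

```python
import xml.etree.ElementTree as ET

# Sample input where the second DATA element has no STD child
xml = '''<SCHOOL>
    <GROUP name="GetStudInfo">
        <DATA>
            <NAME type="char">Sahil Jha</NAME>
            <STD>11th</STD>
        </DATA>
        <DATA>
            <NAME type="char">Rashmi Kaur</NAME>
        </DATA>
    </GROUP>
</SCHOOL>'''

e = ET.fromstring(xml)

my_dict = {}
for data_tag in e.findall('.//DATA'):
    # findtext() returns the child's text, or the default if the child is absent
    name = data_tag.findtext('NAME')
    if name is None:
        continue  # skip DATA elements that have no NAME at all
    my_dict[name] = data_tag.findtext('STD', default='unknown')

print(my_dict)
```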

Upvotes: 2
