Reedy

Reputation: 77

Python - Parse XML with repeated tags using ElementTree

I have the following XML content:

<plist version="1.0">
<dict>
    <key>Version</key><integer>1</integer>
    <key>Sub Version</key><integer>2</integer>
    <dict>
        <key>1</key>
        <dict>
            <key>ID</key><integer>1</integer>
            <key>Name</key><string>Frank</string>
        </dict>
        <key>2</key>
        <dict>
            <key>ID</key><integer>2</integer>
            <key>Name</key><string>Richard</string>
        </dict>
        <key>3</key>
        <dict>
            <key>ID</key><integer>3</integer>
            <key>Name</key><string>Sophia</string>
        </dict>
    </dict>
    <key>Persons</key>
    <array>
        <dict>
            <key>Name</key><string>Persons</string>
            <key>Description</key><string>empty</string>
        </dict>
    </array>
</dict>
</plist>

I'm having a hard time retrieving the names because the XML tag names are all the same and the elements have no attributes. So far I've tried iterating over the second-level dict, but I can't retrieve just the values I want.

What I have so far:

from xml.etree import ElementTree as et

tree = et.parse("file.xml")
root = tree.getroot()

for i in root.find('dict').find('dict').iter('dict'):
    print ([j.text for j in i])

The output I want:

Frank
Richard
Sophia

Does anyone know how to access these values with such tags?

Upvotes: 1

Views: 856

Answers (1)

Jack Fleeting

Reputation: 24928

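Try it with lxml instead. It supports full XPath 1.0, including the following-sibling axis, which ElementTree's limited XPath subset lacks: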

from lxml import etree
plist = """your xml above"""

doc = etree.fromstring(plist)
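# Select each <key> whose text is "Name", then the first <string> sibling that follows it
names = doc.xpath('//dict/dict/key[.="Name"]/following-sibling::string[1]/text()')
print(names)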

output:

['Frank', 'Richard', 'Sophia']
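
Since the question asks about ElementTree specifically, here is a rough standard-library sketch of the same idea. It assumes the layout shown in the question, where each person is a `dict` three levels below the root and every `key` is immediately followed by its value element. The helper names `plist_dict_to_mapping` and `person_names` are made up for illustration and are not part of any library.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the XML from the question (assumption: the real file has the same layout).
PLIST = """<plist version="1.0">
<dict>
    <key>Version</key><integer>1</integer>
    <key>Sub Version</key><integer>2</integer>
    <dict>
        <key>1</key>
        <dict><key>ID</key><integer>1</integer><key>Name</key><string>Frank</string></dict>
        <key>2</key>
        <dict><key>ID</key><integer>2</integer><key>Name</key><string>Richard</string></dict>
        <key>3</key>
        <dict><key>ID</key><integer>3</integer><key>Name</key><string>Sophia</string></dict>
    </dict>
    <key>Persons</key>
    <array>
        <dict><key>Name</key><string>Persons</string></dict>
    </array>
</dict>
</plist>"""


def plist_dict_to_mapping(dict_elem):
    """Pair each <key> child with the element that directly follows it (hypothetical helper)."""
    children = list(dict_elem)
    mapping = {}
    for index, child in enumerate(children[:-1]):
        if child.tag == "key":
            mapping[child.text] = children[index + 1]
    return mapping


def person_names(root):
    """Collect the Name string from every plist/dict/dict/dict element (hypothetical helper)."""
    names = []
    for person in root.findall("./dict/dict/dict"):
        value = plist_dict_to_mapping(person).get("Name")
        # Compare against None explicitly: an Element without children is falsy.
        if value is not None and value.tag == "string":
            names.append(value.text)
    return names


root = ET.fromstring(PLIST)
names = person_names(root)
print(names)
```

The path `./dict/dict/dict` only reaches the person entries, so the `dict` inside `array` (whose Name is "Persons") is skipped. For a file on disk, `ET.parse("file.xml").getroot()` would take the place of `ET.fromstring(PLIST)`.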

Upvotes: 1
