ruben1691

Reputation: 423

Find all attributes in an XML using Beautiful Soup

I have an XML file which looks something like this:

<tagA key1="val1" key2="val2" key3="val3">
<tagB.1 key1="val1" key2="val2" key3="val3"/>
<tagB.2 key1="val1" key2="val2" key3="val3"/>
<tagB.3 key1="val1" key2="val2" key3="val3"/>
<tagB.4 key1="val1" key2="val2" key3="val3"/>
<tagB.5 key1="val1" key2="val2" key3="val3"/>
</tagA>

What I am trying to do is extract the attribute names (key1, key2 and key3) from each tagB.x element and put them into a list, so that I can extract their values later. It should be able to handle more or fewer elements, since each file is different. Thanks!

Upvotes: 0

Views: 710

Answers (1)

Padraic Cunningham

Reputation: 180512

You should use an XML parser, such as xml.etree.ElementTree from the standard library:

xml="""
<tagA key1="val1" key2="val2" key3="val3">
<tagB.1 key1="val1" key2="val2" key3="val3"/>
<tagB.2 key1="val1" key2="val2" key3="val3"/>
<tagB.3 key1="val1" key2="val2" key3="val3"/>
<tagB.4 key1="val1" key2="val2" key3="val3"/>
<tagB.5 key1="val1" key2="val2" key3="val3"/>
</tagA>
"""


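import xml.etree.ElementTree as ET

# Parse the XML string; root is the tagA element.
root = ET.fromstring(xml)
# Each direct child of the root is one tagB.x element.
for child in root:
    # child.attrib is a dict mapping attribute names to values.
    print(child.tag, list(child.attrib.keys()))

Output (Python 3.7+, where the attributes come back in document order):

tagB.1 ['key1', 'key2', 'key3']
tagB.2 ['key1', 'key2', 'key3']
tagB.3 ['key1', 'key2', 'key3']
tagB.4 ['key1', 'key2', 'key3']
tagB.5 ['key1', 'key2', 'key3']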
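
Since the question asks for the names in a list so the values can be looked up later, and says the files vary, here is a minimal sketch of one way to do that with the same parser. The SAMPLE data and the helper names attribute_names and attribute_values are made up for illustration; they are not part of ElementTree or of the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the children deliberately carry different attributes
# to mimic files where the elements are not uniform.
SAMPLE = """
<tagA key1="val1" key2="val2" key3="val3">
<tagB.1 key1="a" key2="b" key3="c"/>
<tagB.2 key1="d"/>
<tagB.3 key1="e" key4="f"/>
</tagA>
"""


def attribute_names(xml_text):
    """Return each attribute name used by the root's direct children, once."""
    root = ET.fromstring(xml_text)
    names = []
    for child in root:
        for name in child.attrib:
            if name not in names:  # skip names already collected
                names.append(name)
    return names


def attribute_values(xml_text, name):
    """Map each child tag to its value for `name`, or None if it is absent."""
    root = ET.fromstring(xml_text)
    # Element.get returns None when the attribute is missing.
    return {child.tag: child.get(name) for child in root}


names = attribute_names(SAMPLE)
print(names)
for name in names:
    print(name, attribute_values(SAMPLE, name))
```

With this sample, attribute_values(SAMPLE, "key4") gives None for tagB.1 and tagB.2 and "f" for tagB.3, so elements that lack an attribute do not raise an error.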

Upvotes: 2
