Reputation: 39
I have a complex XML file that I'm trying to extract data from.
<?xml version="1.0" ?>
<root xmlns="something.something.com">
  <Save>
    <AdditionalInfo>
      <Name></Name>
      <Time></Time>
      <UtilityVersion></UtilityVersion>
      <XMLVersion></XMLVersion>
      <PluginName></PluginName>
      <ClassName></ClassName>
    </AdditionalInfo>
    <Data>
      <session>
        <xyDataObjects>
          <xyData Key="'info'" ObjectType="moreinfo" Type="evenmoreinfo">
            <axis1QuantityType ObjectType="guesswhat" Type="info!">
              <label></label>
              <type></type>
            </axis1QuantityType>
... and so on and so on
The file contains multiple blocks delimited by <Save> and </Save> tags, and the data I'm looking for can be nested as deep as the <label> element, or even deeper.
Element.iter() looked like the solution, since it would walk through every Save block and find the <label> elements I'm looking for, but unfortunately it doesn't accept a namespace argument.
What are my other options? I'd like to keep my code flexible, since I expect the structure of the XML file to change in the future, and simple, so I would rather not implement something like:
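import xml.etree.ElementTree as ET

tree = ET.parse('dblank.xml')
root = tree.getroot()
Array = [None] * len(root)
for i in range(len(root)):
    # Brittle: hard-codes the exact nesting depth down to <label>
    Array[i] = root[i][1][0][0][0][0][0].text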
Upvotes: 1
Views: 2442
Reputation: 30971
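When you process XML with namespaces, you must specify the namespaces used. To this end I defined a dictionary (ns) that maps a prefix of my choosing to the namespace URI, and passed it as the second argument of findall.
Note also that the first argument of findall contains some: as the prefix of the element name.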
Try the following code:
import xml.etree.ElementTree as et
tree = et.parse('Input.xml')
root = tree.getroot()
ns = {'some': 'something.something.com'}
for elem in root.findall('.//some:label', ns):
    print(elem.text)
Of course, this is only an example of how to refer to an existing element. Change it according to your needs.
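If you need the labels grouped by Save block, or would still prefer iter, the following is a minimal sketch of both. It assumes the structure shown in the question; the sample document, its label values (Time, Distance), and the helper names labels_per_save and labels_via_iter are made up for illustration. The iter variant relies on the fact that ElementTree stores namespaced tags as '{uri}tag', so the namespace can be written directly into the tag name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the question's structure; values are invented.
SAMPLE = """<root xmlns="something.something.com">
  <Save>
    <AdditionalInfo><Name>first</Name></AdditionalInfo>
    <Data><session><xyDataObjects>
      <xyData Key="'info'" ObjectType="moreinfo" Type="evenmoreinfo">
        <axis1QuantityType ObjectType="guesswhat" Type="info!">
          <label>Time</label>
          <type>x</type>
        </axis1QuantityType>
      </xyData>
    </xyDataObjects></session></Data>
  </Save>
  <Save>
    <AdditionalInfo><Name>second</Name></AdditionalInfo>
    <Data><session><xyDataObjects>
      <xyData Key="'info'" ObjectType="moreinfo" Type="evenmoreinfo">
        <axis1QuantityType ObjectType="guesswhat" Type="info!">
          <label>Distance</label>
          <type>y</type>
        </axis1QuantityType>
      </xyData>
    </xyDataObjects></session></Data>
  </Save>
</root>"""

NS_URI = "something.something.com"
NS = {"some": NS_URI}  # the prefix "some" is arbitrary; only the URI matters


def labels_per_save(root):
    """Return one list of <label> texts per <Save> block, at any depth."""
    return [
        [label.text for label in save.findall(".//some:label", NS)]
        for save in root.findall("some:Save", NS)
    ]


def labels_via_iter(root):
    """Collect every <label> text with iter(), using the '{uri}tag' form."""
    return [label.text for label in root.iter(f"{{{NS_URI}}}label")]


root = ET.fromstring(SAMPLE)
print(labels_per_save(root))
print(labels_via_iter(root))
```

Neither variant depends on how deeply <label> is nested, so both should keep working if intermediate elements are added or removed later.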
Upvotes: 2