Reputation: 139
I have code in a XML file, which I parse using et.parse:
<VIAFCluster xmlns="http://viaf.org/viaf/terms#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:void="http://rdfs.org/ns/void#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
<viafID>15</viafID>
<nameType>Personal</nameType>
</VIAFCluster>
<mainHeadings>
<data>
<text>
Gondrin de Pardaillan de Montespan, Louis-Antoine de, 1665-1736
</text>
</data>
</mainHeadings>
and I want to parse it as:
[15, "Personal", "Gondrin etc."]
I can't seem to print any of the string information with:
import xml.etree.ElementTree as ET
tree = ET.parse('/Users/user/Documents/work/oneline.xml')
root = tree.getroot()
for node in tree.iter():
name = node.find('nameType')
print(name)
as it appears as 'None' ... what am I doing wrong?
Upvotes: 0
Views: 272
Reputation: 4304
I'm still not sure exactly what you are wanting to do, but hopefully if you run the code below, it will help get you on your way. Using the getiterator function to iter through the elements will let you see what's going on. You can pick up the stuff you want as you come to them:
import xml.etree.ElementTree as et
xml = '''
<VIAFCluster xmlns="http://viaf.org/viaf/terms#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<viafID>15</viafID>
<nameType>Personal</nameType>
<mainHeadings>
<data>
<text>
Gondrin de Pardaillan de Montespan, Louis-Antoine de, 1665-1736
</text>
</data>
</mainHeadings>
</VIAFCluster>
'''
tree = et.fromstring(xml)
lst = []
for i in tree.getiterator():
t = i.text.strip()
if t:
lst.append(t)
print i.tag
print t
You will end up with a list as you wanted. I had to clean up your xml because you had more than one top level element, which is a no-no. Maybe that was your problem all along.
good luck, Mike
Upvotes: 1