Reputation: 147
I have been using ElementTree to read an XML file and can parse it properly. But I don't know how to read comments, especially where the child context is important. In this specific case, I would like to read the comment for NY (that air, bus, and train are available) and store it in a dictionary (name: comment).
<spirit:st>
.....
  <spirit:fa>
    <spirit:name>NY</spirit:name>
    <spirit:den>3</spirit:den>
    <spirit:metro>true</spirit:metro>
    <!-- air, bus, train all available -->
    <spirit:access>air</spirit:access>
  </spirit:fa>
.....
My code:
for state in data.findall('spirit:st', IPXACT_MAP):
    for city in state.findall('spirit:fa', IPXACT_MAP):
        access = city.find('spirit:access', IPXACT_MAP)
        # read comment and set city_access_d[city.text] = comment
Upvotes: 0
Views: 62
Reputation: 52858
If you can use lxml, you should be able to select the comment() with XPath.
Here's an example. I've removed the namespace prefixes to simplify it.
from lxml import etree

xml = """
<st>
  <fa>
    <name>NY</name>
    <den>3</den>
    <!-- ignore me -->
    <metro>true</metro>
    <!-- air, bus, train all available -->
    <access>air</access>
  </fa>
</st>
"""

# Drop whitespace-only text nodes so the comment and <access> are adjacent siblings.
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.fromstring(xml, parser=parser)

city_access_d = {}
for city in tree.xpath(".//fa"):
    name = city.xpath("name")[0].text
    # Select the comment whose immediately following sibling node is <access>.
    comment = city.xpath("comment()[following-sibling::node()[1][self::access]]")[0]
    city_access_d[name] = comment.text.strip()

print(city_access_d)
printed output...
{'NY': 'air, bus, train all available'}
You could also use the following XPath if for some reason you didn't want to create the XMLParser...
comment = city.xpath("comment()[following-sibling::node()[not(self::text())][1][self::access]]")[0]
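If lxml is not an option, a similar result may be possible with the standard library alone. The following is only a sketch, assuming Python 3.8 or later (where `TreeBuilder` accepts `insert_comments=True`) and using a made-up namespace URI in `IPXACT_MAP`; replace it with whatever URI your file actually declares for the `spirit` prefix. Comments then show up as child nodes whose tag is `ET.Comment`, so the loop simply remembers the previous sibling and keeps the comment that sits directly before `spirit:access`.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI (an assumption); use the one declared in your file.
IPXACT_MAP = {"spirit": "http://example.com/spirit"}

xml = """
<spirit:st xmlns:spirit="http://example.com/spirit">
  <spirit:fa>
    <spirit:name>NY</spirit:name>
    <spirit:den>3</spirit:den>
    <!-- ignore me -->
    <spirit:metro>true</spirit:metro>
    <!-- air, bus, train all available -->
    <spirit:access>air</spirit:access>
  </spirit:fa>
</spirit:st>
"""

# insert_comments=True keeps comments in the tree; by default they are discarded.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(xml, parser=parser)

fa_tag = "{%s}fa" % IPXACT_MAP["spirit"]
access_tag = "{%s}access" % IPXACT_MAP["spirit"]

city_access_d = {}
for city in root.iter(fa_tag):
    name = city.find("spirit:name", IPXACT_MAP).text
    previous = None
    for child in city:
        # Keep the comment only when it directly precedes <spirit:access>.
        is_access = child.tag == access_tag
        if is_access and previous is not None and previous.tag is ET.Comment:
            city_access_d[name] = previous.text.strip()
        previous = child

print(city_access_d)
```

Whitespace between elements ends up in `.text` and `.tail` rather than as separate child nodes in ElementTree, so no blank-text handling is needed here.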
Upvotes: 1