Reputation: 3382
Suppose this is my XML:
<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>
What I want is to get tuples like these using XPath:
(mammals,dog)
(mammals,cat)
(reptiles,snake)
Getting the parent names and the animal names separately, or both with 2 queries, is easy. I was wondering if there is a way to get this (or very similar output) with a single XPath query.
Any help will be appreciated!
Upvotes: 3
Views: 1019
Reputation: 89305
In XPath 2.0 or above you can use the for construct (demo):
for $x in /animals/*/*
return concat($x/parent::*/name(), ',', $x/text())
But in lxml, which only supports XPath 1.0, we need to replace it with Python's for loop:
from lxml import etree
raw = """<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>"""
root = etree.fromstring(raw)
for x in root.xpath("/animals/*/*"):
    print(x.getparent().tag, x.text)
Upvotes: 2
Reputation: 12777
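This XPath returns the requested string, but only for the first element, because XPath 1.0 converts a node-set to a string using only its first node in document order. Getting all the pairs could be hard to do with pure XPath.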
'concat("(", local-name(//animals/*), ",", //animals/*/an/text(), ")")'
xmllint --xpath 'concat("(", local-name(//animals/*), ",", //animals/*/an/text(), ")")' ~/tmp/test.xml
(mammals,dog)
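One way around the first-element limitation, staying in XPath 1.0, is to evaluate the same kind of concat() expression once per an element and select the element by position. The following is only a sketch, assuming lxml is installed; the variable name $i and the helper names are made up for illustration, and it issues one query per element plus one count() query, so it is not a single-query solution.

```python
from lxml import etree

raw = """<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>"""

root = etree.fromstring(raw)

# XPath 1.0 expression that formats the i-th <an> element together with
# the name of its parent. $i is an XPath variable supplied from Python.
PAIR_EXPR = (
    'concat("(", '
    'local-name((/animals/*/an)[position() = $i]/..), ",", '
    '(/animals/*/an)[position() = $i], ")")'
)

# count() returns an XPath number, which lxml hands back as a float.
total = int(root.xpath("count(/animals/*/an)"))

# XPath positions start at 1, so iterate from 1 to total inclusive.
lines = [str(root.xpath(PAIR_EXPR, i=i)) for i in range(1, total + 1)]

for line in lines:
    print(line)
```

Each evaluation still returns a single string, which is why the loop over positions lives in Python rather than in the XPath expression itself.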
Upvotes: 0
Reputation: 11695
Try using the xml module from Python's standard library:
from xml.etree import ElementTree
def parse_data(xml_str):
    output = []
    tree = ElementTree.fromstring(xml_str)
    # Iterate over elements directly; getchildren() was removed in Python 3.9
    for m in tree:
        for n in m:
            output.append((m.tag, n.text))
    return output
xml_str = '''
<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>'''
print(parse_data(xml_str))
# output: [('mammals', 'dog'), ('mammals', 'cat'), ('reptiles', 'snake')]
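If you would rather start from the an elements with a single path expression, as the question asks, note that ElementTree elements do not keep a reference to their parent. A possible sketch, using only the standard library, builds a child-to-parent map first; the function name parent_child_pairs and the variable parent_of are made up for this example.

```python
from xml.etree import ElementTree

XML = """<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>"""


def parent_child_pairs(xml_str):
    root = ElementTree.fromstring(xml_str)
    # ElementTree has no getparent(), so map every child to its parent.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # "./*/an" stays within ElementTree's limited XPath subset and
    # returns the matching elements in document order.
    return [(parent_of[an].tag, an.text) for an in root.findall("./*/an")]


pairs = parent_child_pairs(XML)
print(pairs)
```

The path expression does the selection and the map supplies the parent tag, so the nested loops are no longer needed.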
Upvotes: 1
Reputation:
Use lxml:
from io import StringIO
from lxml import etree
xml = """<animals>
<mammals>
<an>dog</an>
<an>cat</an>
</mammals>
<reptiles>
<an>snake</an>
</reptiles>
</animals>"""
tree = etree.parse(StringIO(xml))
for x in tree.xpath("/animals/*"):
    for y in x:
        print((x.tag, y.text))
Output:
('mammals', 'dog')
('mammals', 'cat')
('reptiles', 'snake')
Upvotes: 3