Python, lxml - get a sibling tag's (grand)child's text

Question

I have an XML to parse which is proving really tricky for me.

I would like to iterate through this XML and locate all the id values inside of the bitstreams for a bundle where the name element's value is 'FOO'. I'm not interested in any bundles not named 'FOO', and there may be any number of bundles and any number of bitstreams in the bundles.

I have been using tree.findall('./bundle/name') to find the FOO bundle but this just returns a list that I can't step through for the id values:

for node in tree.findall('./bundle/name'):
    if node.text == 'FOO':
        id_values = tree.findall('./bundle/bitstreams/bitstream/id')
        for value in id_values:
            print(value.text)

This prints out all the id values, not those of the bundle 'FOO'.

How can I iterate through this tree, locate the bundle with the name FOO, take this bundle node and collect the id values nested in it? Is the XPath argument incorrect here?

I'm working in Python with the lxml bindings, but I believe any XML parser would be fine; these aren't large XML trees.

Captain Barbossa · Accepted Answer

You can use XPath for this. The following Python code works:

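import libxml2

data = """
<bundles>
  <bundle>
    <bitstreams>
      <bitstream>
        <id>1234</id>
      </bitstream>
    </bitstreams>
    <name>FOO</name>
  </bundle>
</bundles>
"""

doc = libxml2.parseDoc(data)
# Select the FOO name element, step up to its bundle, then down to the ids.
for node in doc.xpathEval('/bundles/bundle/name[.="FOO"]/../bitstreams/bitstream/id'):
    print(node)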

or using lxml (data is the same as in the example above):

from lxml import etree

bundles = etree.fromstring(data)

for node in bundles.xpath('bundle/name[.="FOO"]/../bitstreams/bitstream/id'):
    print(node.text)

The lxml version outputs:

1234

If the <bitstreams> element always precedes the <name> element, you can also use the more efficient XPath expression:

'bundle/name[.="FOO"]/preceding-sibling::bitstreams/bitstream/id'
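If the order of <name> and <bitstreams> inside a bundle is not guaranteed, one option is to search relative to each matching bundle rather than the whole tree. This is the step the loop in the question misses, since its inner findall runs against tree. The following is only a minimal sketch using the standard library's xml.etree.ElementTree rather than lxml; the sample data and the helper name ids_for_bundle are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two bundles, only one of them named FOO.
# Note that <name> comes first in one bundle and last in the other.
SAMPLE = """
<bundles>
  <bundle>
    <name>FOO</name>
    <bitstreams>
      <bitstream><id>1234</id></bitstream>
      <bitstream><id>5678</id></bitstream>
    </bitstreams>
  </bundle>
  <bundle>
    <bitstreams>
      <bitstream><id>9999</id></bitstream>
    </bitstreams>
    <name>BAR</name>
  </bundle>
</bundles>
"""


def ids_for_bundle(root, bundle_name):
    """Collect the id text of every bitstream in bundles with the given name.

    Illustrative helper, not part of lxml or ElementTree.
    """
    ids = []
    for bundle in root.findall("bundle"):
        if bundle.findtext("name") == bundle_name:
            # Search relative to this bundle, not the whole tree.
            ids.extend(e.text for e in bundle.findall("bitstreams/bitstream/id"))
    return ids


root = ET.fromstring(SAMPLE)
print(ids_for_bundle(root, "FOO"))

# The same selection as a single path, with the predicate on the bundle itself.
print([e.text for e in root.findall("bundle[name='FOO']/bitstreams/bitstream/id")])
```

Both print statements should list only the ids from the FOO bundle. Putting the predicate on the bundle avoids stepping up with .. or relying on sibling order, and the same path should also work with lxml's xpath method.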
