Get etree Element with attribute, or containing subelement with attribute

Question

I have an XML file to parse, and I need to find elements by id.

In the example below, I need to find the name of the driver, but I don't know whether my id belongs to the vehicle, the engine, or the block. I would like a solution that works with arbitrary XML inside vehicle (the existence of a driver element is guaranteed).


    
        Bob Johnson
        
            V8
            
                Aluminium
            
        
    
    
        Dave Edwards
        
            Inline 6
            
                Cast Iron

What I have tried

I tried to get the element by its id and then, if it wasn't a vehicle tag, navigate up the tree to find the vehicle. However, it seems Python's elem.find() returns None if the result is outside elem.
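
The climb-up idea can still work with the standard library. xml.etree.ElementTree elements keep no reference to their parent, so one common workaround is to build a child-to-parent map once and walk it upward. The following is a minimal sketch, not the asker's code. The sample XML is a hypothetical reconstruction: the ids 16, 532, and 113 appear in the answer below, but their assignment to engine and block, the vehicles, type, and material tag names, and the second vehicle's ids are all assumptions. driver_for_id is an illustrative helper name, and ids are assumed to contain no quote characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: only the vehicle/driver/engine/block tag names and
# the ids 16, 532, 113 come from the post; everything else is invented.
SAMPLE = """
<vehicles>
    <vehicle id="16">
        <driver>Bob Johnson</driver>
        <engine id="532">
            <type>V8</type>
            <block id="113">
                <material>Aluminium</material>
            </block>
        </engine>
    </vehicle>
    <vehicle id="17">
        <driver>Dave Edwards</driver>
        <engine id="533">
            <type>Inline 6</type>
            <block id="114">
                <material>Cast Iron</material>
            </block>
        </engine>
    </vehicle>
</vehicles>
""".strip()


def driver_for_id(root, target_id):
    """Return the driver name of the vehicle that has, or contains, target_id."""
    # ElementTree has no parent pointers, so record each child's parent once.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # Find the first element, at any depth below root, carrying the id.
    elem = root.find(".//*[@id='%s']" % target_id)

    # Climb until we reach the enclosing <vehicle> or run out of parents.
    while elem is not None and elem.tag != "vehicle":
        elem = parent_of.get(elem)

    return None if elem is None else elem.findtext("driver")


root = ET.fromstring(SAMPLE)
print(driver_for_id(root, "113"))
```

Building the map costs one pass over the tree, so it is worth caching if many lookups are made against the same document.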

Looking at the docs, they have this example:

# Nodes with name='Singapore' that have a 'year' child
root.findall(".//year/..[@name='Singapore']")

But I don't see how to make that work for any descendant, as opposed to a descendant on a specific level.

UltraInstinct · Accepted Answer

Note: All the snippets below use the lxml library. To install it, run: pip install lxml.

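You should use root.xpath(...) rather than root.findall(...): lxml's xpath() supports full XPath 1.0, whereas findall() handles only a limited subset of it.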

>>> root.xpath("//vehicle/driver/text()")
['Bob Johnson', 'Dave Edwards']

To extract the driver's name for a given ID, you'd do:

>>> vehicle_id = "16"
>>> root.xpath("//vehicle[@id='%s' or .//*[@id='%s']]/driver/text()" % (vehicle_id, vehicle_id))
['Bob Johnson']

UPDATE: The same query also finds the driver's name when the id belongs to an element nested at any depth inside the vehicle:

>>> i = '16'
>>> root.xpath("//vehicle[@id='%s' or .//*[@id='%s']]/driver/text()" % (i, i))
['Bob Johnson']
>>> i = '532'
>>> root.xpath("//vehicle[@id='%s' or .//*[@id='%s']]/driver/text()" % (i, i))
['Bob Johnson']
>>> i = '113'
>>> root.xpath("//vehicle[@id='%s' or .//*[@id='%s']]/driver/text()" % (i, i))
['Bob Johnson']
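
For readers who would rather stay with the standard library, the predicate used above (the vehicle itself has the id, or some descendant does) can be approximated with Element.iter(), which yields an element followed by all of its descendants. This is a sketch under stated assumptions rather than part of the original answer: the sample XML is a hypothetical reconstruction (the vehicles, type, and material tag names, the placement of ids 532 and 113, and the second vehicle's ids are invented), and drivers_for_id is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: only the vehicle/driver/engine/block tag names and
# the ids 16, 532, 113 come from the post; everything else is invented.
SAMPLE = """
<vehicles>
    <vehicle id="16">
        <driver>Bob Johnson</driver>
        <engine id="532">
            <type>V8</type>
            <block id="113"><material>Aluminium</material></block>
        </engine>
    </vehicle>
    <vehicle id="17">
        <driver>Dave Edwards</driver>
        <engine id="533">
            <type>Inline 6</type>
            <block id="114"><material>Cast Iron</material></block>
        </engine>
    </vehicle>
</vehicles>
""".strip()


def drivers_for_id(root, target_id):
    """Return driver names of every vehicle that has, or contains, target_id."""
    names = []
    for vehicle in root.iter("vehicle"):
        # vehicle.iter() yields the vehicle itself first, then every
        # descendant, mirroring "@id=X or .//*[@id=X]" in the XPath predicate.
        if any(elem.get("id") == target_id for elem in vehicle.iter()):
            names.append(vehicle.findtext("driver"))
    return names


root = ET.fromstring(SAMPLE)
for i in ("16", "532", "113"):
    print(i, drivers_for_id(root, i))
```

Because the id is compared as a plain Python string, this variant also avoids interpolating the value into a query string.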
