XPath node-set nesting order selection

Question

Is there an XPath 1.0 expression that I could use, starting at the div[@id='rootTag'] context, to select the nested span descendants based on how deeply they are nested? For example, could you use something like span[2] to select the second most deeply nested span tag, rather than the second span child of the same parent element?

Jack Fleeting · Accepted Answer

It's a bit (a lot...) of a hack, but it can be done this way:

Assume your HTML looks like this:

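levels = """
<div id="rootTag">
  <span>Level2</span>
  <div>
    <span>Level3</span>
    <div>
      <span>Level4</span>
    </div>
  </div>
  <div>
    <span>Level3</span>
  </div>
  <div>
    <div>
      <div>
        <div>
          <span>Level6</span>
        </div>
        <span>Level5</span>
      </div>
    </div>
  </div>
</div>
"""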

We then do this:

# First collect the data:
from lxml import etree  # the HTML must be well-formed XML, or parsing will fail
root = etree.fromstring(levels)
tree = etree.ElementTree(root)

# collect the paths of all <span> elements
paths = [tree.getpath(e) for e in root.iter('span')]

# determine the nesting level of each <span> element
nests = [p.count('/') for p in paths]  # or, alternatively:
# nests = [tree.getpath(e).count('/') for e in root.iter('span')]
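As a cross-check on counting slashes, the same nesting level can probably be obtained from the XPath 1.0 ancestor axis: count(ancestor::*) gives the number of ancestor elements, so adding 1 for the element itself should match the path-based number. A minimal sketch, assuming a small made-up document (the sample string below is hypothetical, not the markup from the question):

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
sample = "<div><span>a</span><div><span>b</span><div><span>c</span></div></div></div>"
root = etree.fromstring(sample)
tree = etree.ElementTree(root)

spans = list(root.iter('span'))

# Depth from the path string, as in the answer above.
by_path = [tree.getpath(e).count('/') for e in spans]

# Depth from the ancestor axis: number of ancestors plus the element itself.
# count() returns a float in lxml, hence the int() conversion.
by_axis = [int(e.xpath('count(ancestor::*)')) + 1 for e in spans]

print(by_path)
print(by_axis)
```

Both lists should come out identical, which suggests the slash count is just the ancestor count in disguise.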

From here, we use the nesting level in the nests list to look up the corresponding element in the paths list. For example, to get the element with the deepest nesting level:

deepest = nests.index(max(nests))
print(paths[deepest],root.xpath(paths[deepest])[0].text)

Output:

/div/div[3]/div/div/div/span Level6

Or, to extract the first element at nesting level 4:

print(paths[nests.index(4)],root.xpath(paths[nests.index(4)])[0].text)

Output:

/div/div[1]/div/span Level4
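Note that list.index() only returns the first match, so spans that share a nesting level are missed. To get closer to the "second most deeply nested" idea from the question, one option is to rank the distinct depths and then select every span at that depth with a plain XPath 1.0 predicate. The sketch below is only one way to do it: spans_at_rank is a hypothetical helper name, the sample markup is made up, and the $n variable relies on lxml binding XPath variables from keyword arguments.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
sample = """
<div id="rootTag">
  <span>A</span>
  <div>
    <span>B</span>
    <div><span>C</span></div>
  </div>
  <div><span>D</span></div>
</div>
"""
root = etree.fromstring(sample)

def spans_at_rank(context, rank):
    """Return all span descendants at the rank-th deepest level (1 = deepest)."""
    spans = context.xpath('.//span')
    # Distinct ancestor counts, deepest first.
    depths = sorted({int(s.xpath('count(ancestor::*)')) for s in spans}, reverse=True)
    if not 1 <= rank <= len(depths):
        return []
    # $n is an XPath variable; lxml binds it from the keyword argument.
    return context.xpath('.//span[count(ancestor::*) = $n]', n=depths[rank - 1])

print([s.text for s in spans_at_rank(root, 1)])
print([s.text for s in spans_at_rank(root, 2)])
```

The ranking step still happens in Python; only the final selection is a single XPath 1.0 expression, since the depth it compares against has to be known beforehand.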
