Peter Robinson

Reputation: 383

Xpath expression to find non-child elements by attribute

Here's a nice puzzle. Suppose we have this bit of markup:

<page n="1">
 <line n="3">...</line>
</page>

It is really easy to locate the line element with n="3" within the page element with n="1" using a simple XPath expression: //page[@n='1']/line[@n='3']. Great, beautiful, elegant. Now suppose what we have instead is this encoding (folks familiar with the TEI will know where this is coming from):

<pb n="1"/>
(arbitrary amounts of stuff)
<lb n="3"/>

We want to find the lb element with n="3" that follows the pb element with n="1". But note: this lb element could be almost anywhere after the pb. It may not be (and most likely is not) a sibling; it could be a child of a sibling of the pb, or a child of a sibling of the pb's parent, and so on.

So my question: how would you search for this lb element with n="3", which follows the pb element with n="1", with XPath?

Thanks in advance

Peter

Upvotes: 2

Views: 136

Answers (2)

Dimitre Novatchev

Reputation: 243459

Use:

  //pb[@n='1']/following::lb[@n='3']
|
  //pb[@n='1']/descendant::lb[@n='3']

This selects any lb element with n="3" that comes after the specified pb in document order, even if the wanted lb element is a descendant of the pb element.

Do note that the following expression on its own doesn't, in general, select all wanted lb elements (it fails to select any that are descendants of the pb element):

  //pb[@n='1']/following::lb[@n='3']

Explanation:

As defined in the W3C XPath specification, the following:: and descendant:: axes are non-overlapping:

"the following axis contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes"
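To see the difference between the two axes on a concrete document, here is a minimal Python sketch. It does not evaluate XPath at all: the standard library's xml.etree.ElementTree only supports a limited XPath subset without the following:: axis, so the sketch walks the tree in document order and mimics the two expressions by hand. The sample document, the id attributes, and the function name select_lb are all made up for illustration. It also assumes the elements are not in a namespace (real TEI files usually are) and, for simplicity, only considers the first matching pb.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one lb nested inside the pb, one deep in a later div.
# The id attributes exist only so the results are easy to read.
SAMPLE = """
<text>
  <div>
    <pb n="1"><note><lb n="3" id="inside"/></note></pb>
    <p>some text <lb n="2" id="other"/></p>
  </div>
  <div>
    <p><hi><lb n="3" id="after"/></hi></p>
  </div>
</text>
"""

def select_lb(root, pb_n, lb_n, include_descendants=True):
    """Return lb elements with n == lb_n located after the first pb with n == pb_n.

    include_descendants=True  mimics  following::lb | descendant::lb
    include_descendants=False mimics  following::lb alone
    """
    pb = next((e for e in root.iter("pb") if e.get("n") == pb_n), None)
    if pb is None:
        return []
    inside_pb = {id(e) for e in pb.iter()}  # the pb itself plus its descendants
    hits, seen_pb = [], False
    for elem in root.iter():  # iter() visits elements in document order
        if elem is pb:
            seen_pb = True
        elif seen_pb and elem.tag == "lb" and elem.get("n") == lb_n:
            if include_descendants or id(elem) not in inside_pb:
                hits.append(elem)
    return hits

root = ET.fromstring(SAMPLE)
both = [e.get("id") for e in select_lb(root, "1", "3")]
following_only = [e.get("id") for e in select_lb(root, "1", "3", include_descendants=False)]
print(both)            # ['inside', 'after']
print(following_only)  # ['after']
```

With a full XPath 1.0 engine (lxml is one option), the expressions above can be evaluated directly instead of being emulated like this.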

Upvotes: 2

oxc

Reputation: 687

That would be

//pb[@n=1]/following::lb[@n=3]

Upvotes: 1
