Synthead

Reputation: 2321

Using XPath 1.0 with HTML to match elements that aren't in a parent

Consider this HTML:

<div>
  <table>
    <tr>
      <td>
        <a class="cal-date">1</a>
        <div class="checkin-time">6 AM | 8h 30m</div>
      </td>
    </tr>
  </table>
</div>

I would like to use XPath 1.0 to return 6 AM | 8h 30m by matching on the class (cal-date) and text content (1) of <a class="cal-date">1</a>. The <a> isn't a parent of the <div> I want (it's a sibling), so I'm a little lost.

How is this done?

Upvotes: 1

Views: 75

Answers (2)

matt

Reputation: 79723

XPath has the concept of axes (that’s plural of axis, not things for cutting down trees). The default axis is the child:: axis, so if you don’t specify it your query will be searching the children of the previous node. You can create more complex queries by using different axes.

In this case you probably want to use the following-sibling:: axis. First select the a element as usual, then in the next location step of your query specify the following-sibling:: axis to search the siblings of the a node rather than its children:

//a[@class='cal-date' and . = '1']/following-sibling::div
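To try this expression outside a browser, here is a minimal sketch in Python. It assumes the third-party lxml package is installed (it is not part of the standard library), and it parses the question's snippet as XML, which only works because that snippet happens to be well-formed. The variable names are illustrative.

```python
from lxml import etree  # third-party; assumed installed (pip install lxml)

# The HTML from the question. It is well-formed, so an XML parser accepts it.
HTML = """
<div>
  <table>
    <tr>
      <td>
        <a class="cal-date">1</a>
        <div class="checkin-time">6 AM | 8h 30m</div>
      </td>
    </tr>
  </table>
</div>
""".strip()

root = etree.fromstring(HTML)

# Select the matching <a>, then step sideways to the <div> elements that follow it
# under the same parent.
matches = root.xpath("//a[@class='cal-date' and . = '1']/following-sibling::div")
checkin_texts = [div.text for div in matches]
print(checkin_texts)
```

For real-world pages that are not well-formed, lxml's HTML parser (lxml.html) would probably be the better choice.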

If you need to, you can be more specific with the div step, as with “normal” XPath, and you can continue the query after the change of axis. For example, if your HTML were more complex and looked like this:

<a class="cal-date">1</a>
<div>A decoy div</div>
<div>
  <span>Not this</span>
  <span class="checkin-time">6 AM | 8h 30m</span>
  <span> Not this either</span>
</div>

you could get at the checkin-time span with an XPath expression like this:

//a[@class='cal-date' and . = '1']/following-sibling::div[2]/span[@class='checkin-time']
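As a rough check of the positional step, the sketch below runs that expression with Python and the third-party lxml package (assumed installed). The wrapping <td> is an assumption added only so the fragment has a single root element and can be parsed as XML; it is not part of the example above.

```python
from lxml import etree  # third-party; assumed installed (pip install lxml)

# The more complex fragment, wrapped in a hypothetical <td> to give it one root.
FRAGMENT = """
<td>
  <a class="cal-date">1</a>
  <div>A decoy div</div>
  <div>
    <span>Not this</span>
    <span class="checkin-time">6 AM | 8h 30m</span>
    <span> Not this either</span>
  </div>
</td>
""".strip()

root = etree.fromstring(FRAGMENT)

# div[2] picks the second following <div> sibling, skipping the decoy;
# the final step uses the default child:: axis to reach the span.
expr = ("//a[@class='cal-date' and . = '1']"
        "/following-sibling::div[2]/span[@class='checkin-time']")
span_texts = [span.text for span in root.xpath(expr)]

# For comparison: the first following <div> (the decoy) has no span children.
decoy_spans = root.xpath(
    "//a[@class='cal-date' and . = '1']/following-sibling::div[1]/span")

print(span_texts, decoy_spans)
```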

Note that when selecting the span element, after the following-sibling::div part, the axis isn’t specified so it uses the default of child::, because we are looking for children of the div.

Upvotes: 1

Jens Erat

Reputation: 38662

You don't need following-sibling for this. Alternatively, select the <div> elements that are children of a table cell which itself contains the link you're looking for:

//td[a[@class='cal-date' and . = '1']]/div
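A small sketch of this in Python, assuming the third-party lxml package is installed. The second table cell (date 2, with a made-up time) is hypothetical and is there only to show that the predicate on td filters out cells whose link does not match.

```python
from lxml import etree  # third-party; assumed installed (pip install lxml)

# First cell is from the question; the second cell is a hypothetical addition.
TABLE = """
<table>
  <tr>
    <td>
      <a class="cal-date">1</a>
      <div class="checkin-time">6 AM | 8h 30m</div>
    </td>
    <td>
      <a class="cal-date">2</a>
      <div class="checkin-time">7 AM | 9h 00m</div>
    </td>
  </tr>
</table>
""".strip()

root = etree.fromstring(TABLE)

# Keep only <td> elements that have a matching <a> child, then take their <div> children.
cells = root.xpath("//td[a[@class='cal-date' and . = '1']]/div")
cell_texts = [div.text for div in cells]
print(cell_texts)
```

Note that this form returns every div child of the matching cell, whether it comes before or after the link, whereas the following-sibling version only returns those after it.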

Upvotes: 0
