Florian S.

Reputation: 446

Get sibling and following text of parent element with XPath

I have the following partial HTML, from which I want to get everything after the last <br> using XPath, that is, the link element and TEXT5 in this example.

<td>
  <span>
    <span>TEXT1</span>
  </span>
  <br>
  TEXT2
  <span>TEXT3</span>
  <br>
  <a href=...>TEXT4</a>
  TEXT5
</td>

It is relatively easy to get the link element with

td/br[last()]/following-sibling::*

but is there a way to get TEXT5 as well?

Upvotes: 1

Views: 757

Answers (1)

har07

Reputation: 89285

As you observed, * only returns elements, while here you need both element nodes and text nodes. You can get them by using node() instead, which returns any kind of node:

td/br[last()]/following-sibling::node()

It is also possible to be more specific if you want. For example, you can add a predicate that restricts the result to a elements and text nodes:

td/br[last()]/following-sibling::node()[self::a|self::text()]

Although the XPath expression itself works, it is possible that your Java API doesn't support returning mixed node types; I don't know.
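
For what it's worth, here is a minimal sketch of how this might be checked with the JDK's built-in javax.xml.xpath API, which can return a node set holding both elements and text nodes. It makes some assumptions that are not in the question: the markup has been rewritten as well-formed XML (self-closed <br/>, whitespace removed, and a placeholder href of "#"), and the class name FollowingSiblingNodes and the helper select are made up for illustration. Other Java XPath libraries may behave differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FollowingSiblingNodes {
    // Assumption: a well-formed, whitespace-free stand-in for the question's HTML.
    // The href value "#" is a placeholder, not taken from the original markup.
    static final String XML =
        "<td><span><span>TEXT1</span></span><br/>TEXT2<span>TEXT3</span><br/>"
        + "<a href=\"#\">TEXT4</a>TEXT5</td>";

    // Hypothetical helper: evaluates expr against xml and describes each
    // matched node as "<kind>:<text>", where kind is "text" or the element name.
    static List<String> select(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // NODESET yields a NodeList that may mix element and text nodes.
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                String kind = n.getNodeType() == Node.TEXT_NODE ? "text" : n.getNodeName();
                out.add(kind + ":" + n.getTextContent().trim());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Prints [a:TEXT4, text:TEXT5]
        System.out.println(select(XML, "td/br[last()]/following-sibling::node()"));
    }
}
```

If the input keeps its original indentation, node() will also match the whitespace-only text nodes between tags, so you may want to filter those out or add a predicate such as [normalize-space()] to the text nodes.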

Upvotes: 1
