Reputation: 446
I have the following partial HTML, from which I want to get everything after the last <br> using Java, i.e. the link element and TEXT5 in this example.
<td>
<span>
<span>TEXT1</span>
</span>
<br>
TEXT2
<span>TEXT3</span>
<br>
<a href=...>TEXT4</a>
TEXT5
</td>
It is relatively easy to get the link element with
td/br[last()]/following-sibling::*
but is there a way to get TEXT5 as well?
Upvotes: 1
Views: 757
Reputation: 89285
As you observed, *
only returns elements, whereas here you need both element nodes and text nodes. You can get both by using node()
instead, which matches any kind of node:
td/br[last()]/following-sibling::node()
You can also be more specific if you want. For example, you can add a predicate that restricts the node type to either an <a> element or a text node:
td/br[last()]/following-sibling::node()[self::a|self::text()]
Although the XPath expression itself works, it is possible that your Java API doesn't support returning mixed node types; I don't know.
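For what it's worth, here is a minimal sketch of how this could be tried with the JDK's built-in javax.xml.xpath API, where a NODESET result comes back as a DOM NodeList that can hold elements and text nodes side by side. It assumes the fragment has already been turned into well-formed XML (<br/> instead of <br>, closed </td>); real-world HTML would probably need an HTML parser or cleaner first. The class name AfterLastBr, the method nodesAfterLastBr, and the "#" href are placeholders made up for illustration, not from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AfterLastBr {

    // Well-formed version of the question's fragment; href="#" is a placeholder.
    static final String SAMPLE =
            "<td><span><span>TEXT1</span></span><br/>TEXT2<span>TEXT3</span><br/>"
            + "<a href=\"#\">TEXT4</a>TEXT5</td>";

    // Describes every non-blank node that follows the last <br/> inside the root <td>.
    static List<String> nodesAfterLastBr(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // node() matches elements and text nodes alike, unlike *.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/td/br[last()]/following-sibling::node()",
                doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            String text = n.getTextContent().trim();
            if (n.getNodeType() == Node.TEXT_NODE && text.isEmpty()) {
                continue; // skip whitespace-only text nodes left over from indentation
            }
            String kind = n.getNodeType() == Node.ELEMENT_NODE
                    ? "element " + n.getNodeName()
                    : "text";
            result.add(kind + ": " + text);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String line : nodesAfterLastBr(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Whitespace-only text nodes are skipped in the loop because indented markup produces them between tags; the same filtering could instead be done in the XPath with a normalize-space() predicate on the text nodes.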
Upvotes: 1