ade19

Reputation: 1200

XPATH: Select text after node

<div class="container-body">

    <div class="rule"><hr></div>
    <h3>Software version:</h3>
    10.0.0

    <div class="rule"><hr></div>
    <h3>Operating system(s):</h3>
    AIX, Linux, Windows

    <div class="rule"><hr></div>
    <h3>Reference #:</h3>
7042947

<div class="rule"><hr></div>
<h3>Modified date:</h3>
<p>2015-04-02</p>

</div>

Given the markup above, how do I select the values 10.0.0; AIX, Linux, Windows; and 7042947, considering that they sit directly inside the container <div> rather than inside an element of their own?

Upvotes: 2

Views: 3520

Answers (3)

Abel

Reputation: 57149

As is often the case, the answer is: "it depends". If you just need the non-whitespace text nodes directly inside the <div>, you can use the expression below. Note that it selects every text node that is a child of the <div>, but not text nodes nested deeper (grandchildren).

div/text()[normalize-space()]
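For readers who want to try this outside an XPath engine, here is a rough Python sketch of what this expression picks up. It deliberately swaps in the standard library's xml.etree.ElementTree, which cannot evaluate text() itself, and imitates the expression through the .text and .tail attributes. The helper name direct_text_nodes and the trimmed sample (with <hr/> self-closed so that it parses as XML) are assumptions made for illustration, and the stripping of surrounding whitespace is an extra convenience that the XPath itself does not perform.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's markup (<hr/> self-closed for XML).
SAMPLE = """<div class="container-body">
    <div class="rule"><hr/></div>
    <h3>Software version:</h3>
    10.0.0

    <div class="rule"><hr/></div>
    <h3>Operating system(s):</h3>
    AIX, Linux, Windows

    <div class="rule"><hr/></div>
    <h3>Modified date:</h3>
    <p>2015-04-02</p>
</div>"""


def direct_text_nodes(elem):
    """Imitate elem/text()[normalize-space()]: non-blank text directly under elem."""
    # In ElementTree, text directly under elem is split between elem.text
    # (before the first child) and each child's .tail (after that child).
    pieces = [elem.text] + [child.tail for child in elem]
    return [p.strip() for p in pieces if p and p.strip()]


container = ET.fromstring(SAMPLE)
values = direct_text_nodes(container)
print(values)
```

The date inside <p> is not returned, because it is a grandchild text node rather than a direct child of the container.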

If you only want the text node that follows a <div class="rule"> which is immediately followed by an <h3>, you can spell that out in XPath:

div
    /div[@class="rule"]
    /following-sibling::*[1]
    /self::h3
    /following-sibling::text()[1]

Which means:

  • select <div>
  • select every child <div> with attribute class="rule"
  • select the first following sibling element
  • only select this following sibling element if it is h3
  • then (if all previous succeed) select the first following text node
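The steps above can also be traced by hand in Python. This is only a sketch under stated assumptions: it uses the standard library's xml.etree.ElementTree (which does not support the following-sibling axis, so each step is written as plain Python), the markup is an XHTML-style copy of the question's sample with <hr/> self-closed, and the function name text_after_headings is hypothetical. Unlike the XPath, it also drops whitespace-only results, which is why the "Modified date" heading does not appear in the output.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the question's markup; <hr/> is self-closed because
# ElementTree parses XML only, not tag-soup HTML.
SAMPLE = """<div class="container-body">

    <div class="rule"><hr/></div>
    <h3>Software version:</h3>
    10.0.0

    <div class="rule"><hr/></div>
    <h3>Operating system(s):</h3>
    AIX, Linux, Windows

    <div class="rule"><hr/></div>
    <h3>Reference #:</h3>
7042947

<div class="rule"><hr/></div>
<h3>Modified date:</h3>
<p>2015-04-02</p>

</div>"""


def text_after_headings(container):
    """Map each <h3> that directly follows a div.rule to the text after it."""
    children = list(container)                 # child elements only
    result = {}
    for i, child in enumerate(children[:-1]):
        # step 2: every child <div> with class="rule"
        if child.tag == "div" and child.get("class") == "rule":
            nxt = children[i + 1]              # step 3: first following sibling element
            if nxt.tag == "h3":                # step 4: keep it only if it is an h3
                # step 5: the text right after </h3>; skipped when it is blank
                value = (nxt.tail or "").strip()
                if value:
                    result[nxt.text] = value
    return result


container = ET.fromstring(SAMPLE)
values = text_after_headings(container)
for heading, value in values.items():
    print(heading, value)
```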

Or, if you want to select any non-whitespace text node in the whole document that is preceded by an <h3>, you can do the following:

//text()[normalize-space()][preceding-sibling::*[1]/self::h3]

This last expression is written to skip over any comment nodes or processing instructions: it selects the text node only if its nearest preceding sibling element is an <h3>, and ignores it otherwise.

Hopefully the above examples give you enough tools to construct your XPath, but if your requirement isn't in there and you can't figure it out, just ask.

Upvotes: 3

Alexander Petrov

Reputation: 14231

The XPath may be as simple as:

"*/text()"

or as:

"*/text()[normalize-space()]"

Which one you need depends on the library.

Upvotes: 1

Mona

Reputation: 352

To get AIX, Linux, Windows, use the following XPath:

//h3[2]/following-sibling::text()[1]

Similarly, create other XPaths to get the remaining strings.
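As a rough illustration of generalising the index, the Python sketch below looks up the n-th <h3> and returns the text that follows it. It is not a full XPath evaluation: it assumes the standard library's xml.etree.ElementTree, which understands the h3[n] position predicate but not following-sibling::text(), so the element's .tail attribute stands in for that last step. The helper name text_after_nth_h3 and the shortened, well-formed sample are hypothetical. As in XPath, h3[n] counts only <h3> elements that share the same parent.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed sample containing only the headings and their values.
SAMPLE = """<div class="container-body">
    <h3>Software version:</h3>
    10.0.0
    <h3>Operating system(s):</h3>
    AIX, Linux, Windows
    <h3>Reference #:</h3>
    7042947
</div>"""


def text_after_nth_h3(root, n):
    """Return the stripped text right after the n-th <h3> (1-based), or None."""
    heading = root.find(f".//h3[{n}]")   # position predicate, as in //h3[n]
    if heading is None:
        return None
    # .tail holds the text between </h3> and the next element.
    return (heading.tail or "").strip()


root = ET.fromstring(SAMPLE)
results = [text_after_nth_h3(root, n) for n in (1, 2, 3)]
print(results)
```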

Upvotes: 0
