Derrick Omanwa

Reputation: 525

How to exclude a child node from xpath?

I have the following HTML:

<div class="content">
  <table id="detailsTable">...</table>
  <div class="desc">
     <p>Some text</p>
  </div>
  <p>Another text</p>
</div>

I want to select all the text within the element with class 'content', which I currently get using this XPath:

doc.xpath('string(//div[@class="content"])')

The problem is that this includes the text inside the 'table' element. I need to exclude the table's text from the result. How can I achieve that?

Upvotes: 0

Views: 325

Answers (2)

E.Wiest

Reputation: 5915

XPath 1.0 solutions. The first one assumes the table comes first inside the div and that its text is not empty:

substring-after(string(//div[@class="content"]),string(//div[@class="content"]/table))

Or, if you only need the two paragraphs that follow the table, just use concat:

concat(//table/following::p[1]," ",//table/following::p[2])

Upvotes: 1

Michael Kay

Reputation: 163595

The XPath expression //div[@class="content"] selects the div element - nothing more and nothing less - and applying the string() function gives you the string value of the element, which is the concatenation of all its descendant text nodes.

Getting all the text except for that contained in one particular child is probably not possible in XPath 1.0. With XPath 2.0 it can be done as

string-join(//div[@class="content"]/(node() except table)//text(), '')

But for this kind of manipulation, you're really in the realm of transformation rather than pure selection, so you're stretching the limits of what XPath is designed for.
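If the filtering is moved into the host language instead, it might look roughly like the sketch below. This is only an illustration under some assumptions: it uses Python's standard-library xml.etree.ElementTree rather than whatever library provides doc.xpath in the question, it assumes the markup is well-formed (the last <p> is closed), the table content "Table text" is a made-up placeholder, and text_without is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Markup from the question, with the final <p> closed so it parses as XML.
# "Table text" is placeholder content standing in for the elided table body.
HTML = """<div class="content">
  <table id="detailsTable"><tr><td>Table text</td></tr></table>
  <div class="desc">
     <p>Some text</p>
  </div>
  <p>Another text</p>
</div>"""


def text_without(element, excluded_tag):
    """Concatenate the element's text, skipping direct children named excluded_tag."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != excluded_tag:
            # itertext() yields every descendant text node of the child.
            parts.append("".join(child.itertext()))
        # The tail is text that follows the child inside the parent, so keep it.
        parts.append(child.tail or "")
    return "".join(parts)


content = ET.fromstring(HTML)  # the root element is the div with class "content"
text = text_without(content, "table")
print(" ".join(text.split()))  # prints: Some text Another text
```

The helper only skips direct children of the div, which matches the structure in the question; a table nested deeper would need a recursive variant.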

Upvotes: 0
