Return full text element (including child/descendant elements)

Question

I'm trying to get the text from the first occurrence on the page of div/p, and only the first p. The

, even between embedded tags?

puts doc.xpath('html/body/div/p[1]/text()').first

Dimitre Novatchev · Accepted Answer

Use:

string((//div/p)[1])

When this XPath expression is evaluated, the result is the string value of the first p element in the document that is a child of a div.

By definition, the string value of an element is the concatenation (in document order) of all of its text-node descendants.

Therefore, you get exactly all the text in the subtree rooted at this p element, with any other nodes (elements, comments, processing instructions) skipped.

XSLT-based verification:

When this transformation is applied to the following XML document (none was provided in the question):


 Hello 
  XML
   World!

the result of the evaluated XPath expression is output:

 Hello XML
   World!
