Zooly

Reputation: 4787

How to get all descendant text content in XML using XPath

XML file

<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>

What I expect to get:

Here is the first part...The secondAnd the third

What I actually get:

Here is the first part...And the third.

I tried descendant-or-self::* as well as the child and descendant axes, but none of them gave the expected result.

Can someone tell me how to get the text in the child nodes too?

Upvotes: 3

Views: 2122

Answers (2)

kjhughes

Reputation: 111561

XPath 1.0

XPath 1.0 has no string-join() function, so you cannot trim and concatenate the individual text descendants of a given node within XPath 1.0 itself. You can select the nodes in XPath,

/TEXT/DESCR//text()

but then you'll have to perform the concatenation in the hosting language.

In PHP:

$xml = '<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>';
$dom = new DOMDocument();
$dom->loadXML($xml);
$x = new DOMXPath($dom);
// Trim each descendant text node and print the pieces back to back.
foreach ($x->query("/TEXT/DESCR//text()") as $node) {
    echo trim($node->textContent);
}

This outputs the result you requested:

Here is the first part...The secondAnd the third

Alternatively, if you have no other reason to iterate over the text nodes, replace the foreach loop above with a single evaluate() call:

$xml = '<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>';
$dom = new DOMDocument();
$dom->loadXML($xml);
$x= new DOMXpath($dom);
echo str_replace(PHP_EOL, '', $x->evaluate('normalize-space(/TEXT/DESCR)'));

Which yields (note the single spaces that normalize-space() leaves between the parts):

Here is the first part... The second And the third
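
The same host-language concatenation can be sketched outside PHP. Below is a minimal Python version using only the standard library. Its ElementTree module does not support text() in its XPath subset, so itertext() stands in for /TEXT/DESCR//text() here; that substitution, and the function names joined_trimmed and normalized, are assumptions of this sketch rather than part of the answer above.

```python
import xml.etree.ElementTree as ET

XML = """<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>"""


def joined_trimmed(xml_source: str) -> str:
    """Trim every descendant text node of DESCR, then concatenate."""
    descr = ET.fromstring(xml_source).find("DESCR")
    # itertext() yields the text of DESCR and of all its descendants,
    # in document order.
    return "".join(piece.strip() for piece in descr.itertext())


def normalized(xml_source: str) -> str:
    """Mimic normalize-space(): collapse whitespace runs to one space."""
    descr = ET.fromstring(xml_source).find("DESCR")
    return " ".join("".join(descr.itertext()).split())


if __name__ == "__main__":
    print(joined_trimmed(XML))
    print(normalized(XML))
```

The first function corresponds to the foreach loop with trim(), the second to the normalize-space() variant.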

XPath 2.0

You can perform the concatenation of all text descendants of a given node within XPath 2.0:

string-join(/TEXT/DESCR//text(), '')
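
Note that string-join() joins the text nodes as they are, so the indentation and line breaks inside DESCR survive in the result. The Python sketch below only emulates that behavior with the standard library (it is not an XPath 2.0 engine, and raw_join is an illustrative name). To get the exact string from the question in XPath 2.0, you would likely need to trim each node first, for example with string-join(/TEXT/DESCR//text()/normalize-space(), '').

```python
import xml.etree.ElementTree as ET

XML = """<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>"""


def raw_join(xml_source: str) -> str:
    """Emulate string-join(/TEXT/DESCR//text(), ''): join without trimming."""
    descr = ET.fromstring(xml_source).find("DESCR")
    # Each piece keeps its original leading and trailing whitespace.
    return "".join(descr.itertext())


if __name__ == "__main__":
    # repr() makes the retained newlines and indentation visible.
    print(repr(raw_join(XML)))
```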

Upvotes: 4

ErikL

Reputation: 2041

If you can't change your input XML, this might work:

concat(/TEXT/DESCR,/TEXT/DESCR/PLUS)

or, in XPath 2.0 (where string-join() expects a separator as its second argument):

string-join(/TEXT/DESCR/descendant-or-self::text(), '')
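
One caveat is worth checking before relying on the concat() form: in standard XPath, the string-value of an element already contains the text of all its descendants, so passing both DESCR and DESCR/PLUS may repeat "The second". The Python sketch below emulates the two string-values with the standard library to make the effect visible; string_value is a hypothetical helper, not an XPath API.

```python
import xml.etree.ElementTree as ET

XML = """<TEXT>
    <DESCR>
         Here is the first part...
         <PLUS>The second</PLUS>
         And the third
    </DESCR>
</TEXT>"""


def string_value(element: ET.Element) -> str:
    """Emulate the XPath string-value: all descendant text, in document order."""
    return "".join(element.itertext())


root = ET.fromstring(XML)
descr = root.find("DESCR")
plus = descr.find("PLUS")

# Emulates concat(/TEXT/DESCR, /TEXT/DESCR/PLUS).
combined = string_value(descr) + string_value(plus)

if __name__ == "__main__":
    # Counts how often the PLUS text appears in the combined string.
    print(combined.count("The second"))
```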

Upvotes: 0
