Reputation: 12885

Only select text directly in node, not in child nodes

How does one retrieve the text in a node without selecting the text in the children?

<div id="comment">
     <div class="title">Editor's Description</div>
     <div class="changed">Last updated: </div>
     <br class="clear">
     Lorem ipsum dolor sit amet.
</div>

In other words, I want Lorem ipsum dolor sit amet. rather than Editor's DescriptionLast updated: Lorem ipsum dolor sit amet.

Upvotes: 50

Answers (3)

bosari

Reputation: 2010

How about this:
$doc/node()[3]/text()
This assumes $doc holds the XML document.

Upvotes: 1

Dimitre Novatchev

Reputation: 243579

In the provided XML document:

<div id="comment">
      <div class="title">Editor's Description</div>
      <div class="changed">Last updated: </div>
      <br class="clear"/>
      Lorem ipsum dolor sit amet. 
</div>

the top element /div has four child nodes that are text nodes. The first three of these are whitespace-only. The last one is the text node that is wanted.

Use:

/div/text()[last()]

This is different from:

/div/text()

The latter may (depending on whether whitespace-only nodes are preserved by the XML parser) select all 4 text nodes, but you only want the last of them.

An alternative is (when you don't know exactly which text-node you want):

/div/text()[normalize-space()]

This selects all text-node-children of /div that are not whitespace-only text nodes.
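
To make the four text-node children concrete, here is a minimal Python sketch. It is not XPath: Python's standard library does not evaluate text() steps, so the sketch walks the DOM children with xml.dom.minidom and filters them by hand. The helper name direct_text_nodes is made up for illustration, and the br element is written as self-closing on the assumption that the fragment must parse as well-formed XML.

```python
from xml.dom import minidom

# The fragment from the question; <br/> is self-closed (assumption) so it is well-formed XML.
XML = """<div id="comment">
     <div class="title">Editor's Description</div>
     <div class="changed">Last updated: </div>
     <br class="clear"/>
     Lorem ipsum dolor sit amet.
</div>"""


def direct_text_nodes(element):
    """Hypothetical helper: data of the text nodes that are direct children of element."""
    return [n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE]


root = minidom.parseString(XML).documentElement

all_text = direct_text_nodes(root)              # roughly /div/text()
non_blank = [t for t in all_text if t.strip()]  # roughly /div/text()[normalize-space()]
last = all_text[-1]                             # roughly /div/text()[last()]

print(len(all_text))                   # number of direct text-node children
print([t.strip() for t in non_blank])  # the non-whitespace ones, trimmed
print(last.strip())                    # the last one, trimmed
```

minidom keeps whitespace-only text nodes, which is why all four show up here. A parser configured to drop them would leave only the last one.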

Upvotes: 53

Lucero

Reputation: 60276

Just select text() instead of .:

div/text()

On the given XML fragment, this returns the following (plus whitespace-only text nodes, if the parser preserves them):

Lorem ipsum dolor sit amet.
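
The difference between the element's full string value and its own text can be sketched in Python with xml.etree.ElementTree. This is an illustration rather than an XPath evaluation: ElementTree's path support does not include text(), so the direct text is assembled from root.text and each child's tail. The variable names are made up, and the br element is self-closed on the assumption that the fragment has to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# The fragment from the question; <br/> is self-closed (assumption) so it parses as XML.
XML = """<div id="comment">
     <div class="title">Editor's Description</div>
     <div class="changed">Last updated: </div>
     <br class="clear"/>
     Lorem ipsum dolor sit amet.
</div>"""

root = ET.fromstring(XML)

# Text of the element and all of its descendants (what selecting "." and
# taking its string value would give).
everything = "".join(root.itertext())

# Text that belongs to the element itself: the text before the first child,
# plus the "tail" text that follows each child element.
direct = [root.text or ""] + [child.tail or "" for child in root]

# Collapse runs of whitespace so the two results are easy to compare.
everything_clean = " ".join(everything.split())
direct_clean = " ".join(" ".join(direct).split())

print(everything_clean)
print(direct_clean)
```

The first line printed includes the title and "Last updated:" text from the child divs, while the second contains only the sentence that sits directly inside the outer div.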

Upvotes: 18
