Reputation: 25631
Please consider this sample file: http://www.w3schools.com/dom/books.xml
This XPath expression:
//title/text()
returns:
Everyday Italian
Harry Potter
XQuery Kick Start
Learning XML
Now I want just the first word of each title, so I tried:
tokenize(//title/text(),' ')[1]
which fails with:
Too many items
On the other hand,
tokenize((//title/text())[1],' ')[1]
returns the first word, but only for the first node.
How can I get substrings with XPath while iterating nodes?
Upvotes: 3
Views: 223
Reputation: 243579
Use:
//text()/tokenize(.,' ')[1]
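This produces a sequence of the first "word" of every text node in the XML document. The expression to the right of / is evaluated once per selected node, so tokenize() always receives a single string rather than a sequence of several strings, which is what causes the "Too many items" error.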
XSLT 2.0-based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:sequence select="//text()/tokenize(.,' ')[1]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the following XML document:
<t>
<a>Everyday Italian</a>
<b>Harry Potter</b>
<c>XQuery Kick Start</c>
<d>Learning XML</d>
</t>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
Everyday
Harry
XQuery
Learning
The result above also includes entries for a few whitespace-only text nodes (the line breaks between the elements).
If you want to ignore any whitespace-only text node, change the XPath expression to:
//text()[normalize-space()]/tokenize(.,' ')[1]
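For readers who want to see the per-node evaluation spelled out, here is a minimal Python sketch of the same logic. It is not XPath and not part of the original answer. It assumes the sample document shown above and uses only the standard library. The helper name first_words is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample document from the answer above.
XML = """<t>
<a>Everyday Italian</a>
<b>Harry Potter</b>
<c>XQuery Kick Start</c>
<d>Learning XML</d>
</t>"""


def first_words(xml_text):
    """Return the first space-separated word of every non-blank text node.

    Mirrors //text()[normalize-space()]/tokenize(.,' ')[1]: each text node
    is split on its own, so the splitter never sees more than one string.
    """
    root = ET.fromstring(xml_text)
    words = []
    for text in root.itertext():
        # Skip whitespace-only text (the line breaks between elements).
        if text.strip():
            words.append(text.split(" ")[0])
    return words


print(first_words(XML))
```

The loop body plays the role of the step after the / operator: it runs once per text node, which is why no "too many items" situation can arise.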
Upvotes: 2
Reputation: 11181
Try this:
1. To get all words except the last one, use:
//title/string-join(tokenize(.,'\s+')[position() ne last()],' ')
or
2. To get only the first word, use:
//title/string-join(tokenize(.,'\s+')[position() eq 1],' ')
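To make the two variants concrete, here is a rough Python sketch of what they compute for each title. This is an illustration only, not XPath. The titles list is copied from the question, and the function names all_but_last and first_only are hypothetical.

```python
import re

# Titles as listed in the question.
titles = ["Everyday Italian", "Harry Potter", "XQuery Kick Start", "Learning XML"]


def all_but_last(title):
    """Split on runs of whitespace and rejoin every part except the last."""
    parts = re.split(r"\s+", title)
    return " ".join(parts[:-1])


def first_only(title):
    """Split on runs of whitespace and keep only the first part."""
    return re.split(r"\s+", title)[0]


for t in titles:
    print(all_but_last(t), "|", first_only(t))
```

For two-word titles both variants give the same result. They differ only when a title has three or more words, as with "XQuery Kick Start".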
Hope this helps.
Upvotes: 1