Reputation: 161
How do I retrieve the text inside article-field1?
<title>Testing</title>
<link>http://example.org</link>
<description>Description</description>
<language>en-us</language>
<lastBuildDate>Mon, 13 Feb 2012 00:00:00 +0000</lastBuildDate>
<item>
<title>Title Here</title>
<link>http://example.org/2012/03/27/</link>
<description><![CDATA[
<div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
<div id="article-field2">123</div>
<pubDate>Tue, 2 Mar 2012 00:00:00 +0000</pubDate>
</item>
I've tried to use
//description/div[@id="article-field1"]/text()
Any advice?
Thanks
Upvotes: 15
Views: 30606
Reputation: 13841
//description/div[@id="article-field1"]/a/text()
This works if the malformed CDATA tag is removed, a root element is added, and the corresponding 'description' tag is closed. That assumes the original XML was only partially pasted, which is the only reading that makes sense given the expression. Basically, the original query was missing the a element.
This can be verified at http://www.xpathtester.com/.
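For a quick local check, here is a minimal sketch in Python using only the standard library. The XML is an assumed repair of the snippet in the question (CDATA marker dropped, description closed, and a hypothetical channel root added), not the asker's real feed. ElementTree only supports a subset of XPath and has no text() step, so the sketch selects the a element and reads its .text instead.

```python
import xml.etree.ElementTree as ET

# Assumed repair of the question's snippet: CDATA marker removed,
# <description> closed, and a hypothetical <channel> root added.
XML = """<channel>
  <item>
    <title>Title Here</title>
    <description>
      <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
      <div id="article-field2">123</div>
    </description>
  </item>
</channel>"""

root = ET.fromstring(XML)

# Original query shape: stops at the div, which has no text node of its own.
div = root.find('.//description/div[@id="article-field1"]')
div_text = div.text

# Corrected query shape: step down into the <a> child, then read its text.
link = root.find('.//description/div[@id="article-field1"]/a')
link_text = link.text

print(repr(div_text))
print(repr(link_text))
```

The div's own text is empty because everything inside it belongs to the a child, which is why the extra /a step is needed.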
Upvotes: 6
Reputation: 4249
From what I can see, your data is inside a CDATA section. That prevents its content from being parsed as markup, so XPath sees it only as plain text.
See How do I retrieve element text inside CDATA markup via XPath? for more details.
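As a rough illustration of that point, the sketch below (Python, standard library only) parses a feed where the CDATA section is properly closed. The closing ]]> marker, the channel root, and the wrapper element are assumptions added for the example. The description element then has no child elements, only text, so the embedded markup has to be parsed in a second step.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the feed: the CDATA section is closed
# and a hypothetical <channel> root is added.
XML = """<channel>
  <item>
    <description><![CDATA[
<div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
<div id="article-field2">123</div>
]]></description>
  </item>
</channel>"""

root = ET.fromstring(XML)
description = root.find(".//item/description")

# Step 1: inside CDATA the divs are plain text, not elements.
child_count = len(list(description))
direct_hit = root.find('.//description/div[@id="article-field1"]/a')

# Step 2: parse the text as its own document. "wrapper" is a hypothetical
# root element, needed because the fragment has two top-level divs.
fragment = ET.fromstring("<wrapper>" + description.text + "</wrapper>")
link_text = fragment.find('./div[@id="article-field1"]/a').text

print(child_count, direct_hit, link_text)
```

This second parse only works when the embedded markup happens to be well-formed XML; real feeds often carry looser HTML, in which case an HTML parser would be the safer choice for step 2.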
Upvotes: 3
Reputation: 12729
You can't do it with a single call to a plain-vanilla XPath processor.
You have two choices:
Upvotes: 2