shadow

Reputation: 161

retrieve xpath content from div id

How do I retrieve the text inside article-field1?

<title>Testing</title>
  <link>http://example.org</link>
  <description>Description</description>
  <language>en-us</language>
  <lastBuildDate>Mon, 13 Feb 2012 00:00:00 +0000</lastBuildDate>

  <item>
    <title>Title Here</title>
    <link>http://example.org/2012/03/27/</link>
    <description><![CDATA[
        <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
        <div id="article-field2">123</div>
    <pubDate>Tue, 2 Mar 2012 00:00:00 +0000</pubDate>
  </item>

I've tried to use

//description/div[@id="article-field1"]/text()

Any advice?

Thanks

Upvotes: 15

Views: 30606

Answers (3)

ingyhere

Reputation: 13841

//description/div[@id="article-field1"]/a/text() 

This works once the malformed CDATA section is removed, a root element is added, and the corresponding 'description' tag is closed. That assumes the XML in the question was only partially pasted, which is the only reading under which the expression makes sense. Basically, the original query was missing the a element step.
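As a rough illustration of the point above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The repaired XML is an assumption (CDATA wrapper removed, description closed, and a hypothetical channel root added), not the asker's real feed. ElementTree supports only a subset of XPath and has no text() step, so the sketch selects the a element and reads its .text instead.

```python
import xml.etree.ElementTree as ET

# Assumed repair of the question's snippet: CDATA wrapper removed,
# <description> closed, and a hypothetical <channel> root added.
FIXED = """<channel>
  <item>
    <title>Title Here</title>
    <description>
        <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
        <div id="article-field2">123</div>
    </description>
  </item>
</channel>"""

root = ET.fromstring(FIXED)

# Equivalent of //description/div[@id="article-field1"]/a/text():
# select the <a> element, then read its text.
link = root.find(".//description/div[@id='article-field1']/a")
link_text = link.text

# The expression from the question stops at the div, which has no
# text of its own before the <a> child, so nothing useful comes back.
div = root.find(".//description/div[@id='article-field1']")
div_text = div.text

print(link_text)
print(div_text)
```

The second lookup shows why the original expression came back empty: the text belongs to the a element, not directly to the div.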

This can be verified at http://www.xpathtester.com/.

Upvotes: 6

Olivier.Roger

Reputation: 4249

From what I see, your data is inside a CDATA section. The parser treats everything in it as plain text, so the markup inside is not available as elements for XPath to navigate.

See How do I retrieve element text inside CDATA markup via XPath? for more details.
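A small Python sketch of that behaviour, using the standard library's xml.etree.ElementTree. The feed below is an assumed, well-formed version of the question's XML (closing ]]> and description added, wrapped in hypothetical rss/channel elements), so treat it as an illustration rather than the asker's actual document.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the feed, with the CDATA section closed.
FEED = """<rss><channel><item>
  <description><![CDATA[
    <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
    <div id="article-field2">123</div>
  ]]></description>
</item></channel></rss>"""

description = ET.fromstring(FEED).find(".//description")

# The CDATA content is one text node: description has no child elements,
# so a path step into "div" matches nothing.
child_count = len(description)
matches = description.findall("div")

# The markup is still there, but only as a string.
raw = description.text

print(child_count, matches)
print(raw.strip())
```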

Upvotes: 3

Sean B. Durkin

Reputation: 12729

You can't do it with a single call to a plain-vanilla XPath processor.

You have two choices:

  1. Use an XPath processor that implements the dyn:evaluate() function (which raises the question: what processor and version are you using?); OR
  2. Use two calls. The first gets the text value of the description node (for example //item/description). The second, run after loading the result of the first as a new XML document (with a few tweaks to convert the XML fragment into a proper XML document), is div[@id="article-field1"] .
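The two-call approach in option 2 can be sketched in Python with the standard library's xml.etree.ElementTree. This is only a sketch under assumptions: the feed is a well-formed stand-in for the question's XML (CDATA closed, hypothetical rss/channel wrapper), the embedded HTML happens to be well-formed XML, and the wrapper element name root is made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the feed from the question.
FEED = """<rss><channel><item>
  <title>Title Here</title>
  <description><![CDATA[
    <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
    <div id="article-field2">123</div>
  ]]></description>
</item></channel></rss>"""

# Call 1: get the text value of the description node (the CDATA content).
fragment = ET.fromstring(FEED).findtext(".//item/description")

# The fragment has two top-level divs, so wrap it in a single
# (hypothetical) root element to make it a proper XML document.
inner = ET.fromstring("<root>" + fragment + "</root>")

# Call 2: query the new document for the div by id.
div = inner.find("div[@id='article-field1']")

# Collect all text under the div, including the text of the nested <a>.
text = "".join(div.itertext())
print(text)
```

If the embedded markup is real-world HTML rather than well-formed XML, the second parse would likely need an HTML parser instead of an XML one.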

Upvotes: 2
