user1010434

Reputation: 241

How to use XPath to get the value of a node that contains a CDATA section in Java

I am using XPath to parse RSS XML data. The data is:

    <rss version="2.0">
      <channel>
        <title>
          <![CDATA[sports news]]>
        </title>
      </channel>
    </rss>

I want to get the text "sports news" using the XPath expression "/rss/channel/title/text()", but the actual result is "\r\n" instead. How can I get the result I want?

The code is below:

    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
    XPathFactory xpathFactory = XPathFactory.newInstance();
    XPath xPath = xpathFactory.newXPath();
    Node node = (Node) xPath.evaluate("/rss/channel/title/text()", doc, XPathConstants.NODE);
    String title = node.getNodeValue();

Upvotes: 6

Views: 6598

Answers (2)

prunge

Reputation: 23268

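Try calling setCoalescing(true) on your DocumentBuilderFactory before creating the DocumentBuilder. The parser will then convert CDATA sections to text nodes and merge them with any adjacent text nodes, so text() selects a single node containing "sports news". That node still includes the surrounding whitespace, so you will probably want to trim() the value.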
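Here is a minimal sketch of that approach, assuming the XML arrives as a String rather than the InputStream `is` from the question. The class name `CoalescingTitleReader` and the method `readTitle` are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the original question.
class CoalescingTitleReader {

    static String readTitle(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Convert CDATA sections to text and append them to adjacent text nodes.
        factory.setCoalescing(true);

        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xPath.evaluate("/rss/channel/title/text()", doc,
                                          XPathConstants.NODE);

        // The merged text node still carries the line breaks and indentation
        // around the CDATA section, so strip them here.
        return node.getNodeValue().trim();
    }
}
```

Note that the factory has to be configured before newDocumentBuilder() is called; changing it afterwards does not affect a builder that already exists.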

Upvotes: 4

LarsH

Reputation: 28004

You could try changing the XPath expression to

    "string(/rss/channel/title)"

and using the return type STRING instead of NODE. The result is then a String, not a Node, so cast it accordingly:

    String title = (String) xPath.evaluate("string(/rss/channel/title)", doc,
                                           XPathConstants.STRING);

This way you are not selecting a text node, but rather the string value of the title element, which consists of the concatenation of all its descendant text nodes.
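Because the string value also includes the whitespace around the CDATA section, it may be worth wrapping the path in normalize-space() (or calling trim() on the result). A rough sketch, assuming the XML is supplied as a String; `StringValueTitleReader` and `readTitle` are names invented for this example:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of the original question.
class StringValueTitleReader {

    static String readTitle(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xPath = XPathFactory.newInstance().newXPath();

        // normalize-space() takes the string value of <title> (all descendant
        // text, including the CDATA content), strips leading and trailing
        // whitespace, and collapses internal runs of whitespace to one space.
        return (String) xPath.evaluate("normalize-space(/rss/channel/title)", doc,
                                       XPathConstants.STRING);
    }
}
```

Keep in mind that normalize-space() also collapses whitespace inside the title, so use string() plus trim() instead if internal spacing matters.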

Upvotes: 0
