Reputation: 241
I am using XPath to parse RSS XML data. The data is:
<rss version="2.0">
<channel>
<title>
<![CDATA[sports news]]>
</title>
</channel>
</rss>
I want to get the text "sports news" using the XPath expression "/rss/channel/title/text()", but the actual result is "\r\n". How can I get the result I want?
The code is below:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xPath = xpathFactory.newXPath();
Node node = (Node) xPath.evaluate("/rss/channel/title/text()", doc, XPathConstants.NODE);
String title = node.getNodeValue();
Upvotes: 6
Views: 6598
Reputation: 23268
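Try calling setCoalescing(true) on your DocumentBuilderFactory before creating the DocumentBuilder. Without it, the title element has three child nodes (a whitespace text node, the CDATA section, and another whitespace text node), and text() with return type NODE gives you the first one, which is the "\r\n". With coalescing enabled, the parser converts CDATA sections to text and merges them with the adjacent text nodes into a single node.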
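A minimal sketch of this approach, assuming the feed is available as a String. The class name CoalescingTitle and the method readTitle are made up for illustration, and the trim() call is an addition here, since the merged text node still contains the line breaks around the CDATA section:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the original question.
class CoalescingTitle {

    static String readTitle(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Convert CDATA sections to text and merge them with adjacent text nodes.
            factory.setCoalescing(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // Same expression as in the question; title now has a single text child.
            Node node = (Node) xPath.evaluate(
                    "/rss/channel/title/text()", doc, XPathConstants.NODE);

            // The node value still includes the surrounding line breaks, so trim it.
            return node.getNodeValue().trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the title", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>\r\n"
                + "<![CDATA[sports news]]>\r\n"
                + "</title></channel></rss>";
        System.out.println(readTitle(xml));
    }
}
```

The only change to the original code is the setCoalescing(true) call, which has to happen on the factory before newDocumentBuilder() is called.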
Upvotes: 4
Reputation: 28004
You could try changing the XPath expression to
"string(/rss/channel/title)"
and use return type STRING instead of NODE:
String title = (String) xPath.evaluate("string(/rss/channel/title)", doc,
XPathConstants.STRING);
This way you are not selecting a text node, but rather the string value of the title element, which consists of the concatenation of all its descendant text nodes.
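Here is a rough, self-contained sketch of the string() approach, again assuming the feed is held in a String. TitleStringValue and readTitle are illustrative names only, and the trim() is an assumption about what you want, because the string value includes the line breaks before and after the CDATA section:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class TitleStringValue {

    static String readTitle(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // With XPathConstants.STRING the result is a java.lang.String, not a Node.
            String value = (String) xPath.evaluate(
                    "string(/rss/channel/title)", doc, XPathConstants.STRING);

            // The string value is whitespace + CDATA text + whitespace; trim the ends.
            return value.trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the title", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>\r\n"
                + "<![CDATA[sports news]]>\r\n"
                + "</title></channel></rss>";
        System.out.println(readTitle(xml));
    }
}
```

If you would rather not trim in Java, the expression "normalize-space(/rss/channel/title)" should do the trimming inside XPath, though it also collapses internal runs of whitespace to single spaces.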
Upvotes: 0