Oscar Hierro

Reputation: 1107

XPath: select a node without fetching its children

How can I use XPath to select a node without retrieving all of its child nodes? For instance, in the following XML document:

<parentnode>
  <node1 a="b" b="c">
    <child1/>
    <child2/>
    ... many many child nodes
    <childN/>
  </node1>
  <node2/>
</parentnode>

I'd like to select the 'node1' element in order to inspect its attributes, but without also selecting its child nodes. I don't need to parse them, and there could be thousands of them, which hurts the performance of the query (its output is used to build a sort of DOM tree of arrays and dictionaries in a 3rd party library).

Update: just to be clearer, the 3rd party library I mentioned is actually just an Objective-C wrapper around the libxml2 parser that builds a DOM tree made of Foundation classes with the result of any XPath query. The queries themselves are executed over an already parsed document (xmlDocPtr) that is reused for all queries, so yes, as many answers say, the document is already DOM'ed up at C level, but the Objective-C wrapper implementation produces the performance hit in this particular scenario. I could modify this library to optionally not fetch the selected node's children, but I thought there would probably be a simple way to retrieve just the node's attributes with the query.

Upvotes: 4

Views: 6018

Answers (4)

Michael Kay

Reputation: 163262

An XPath expression like /a/b/c will select the c elements: it does not select their children. The reason many people imagine that it also selects the children is that many tools will show the result of the XPath expression by showing you the entire subtree rooted at the c element. One can understand why they do that - it shows you visually what you've selected - but the XPath expression itself is just returning a pointer to the selected element, and where you go from there is entirely up to you. (Some tools, rather than showing you the subtree rooted at the element, show the path to the node with all its ancestors - that's equally valid.)
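A minimal sketch of this point, using Python's standard-library ElementTree rather than the asker's libxml2 wrapper (ElementTree implements only a subset of XPath, and the sample document is the one from the question with the elided children trimmed). The query returns a reference to the element; reading its attributes never walks the children:

```python
import xml.etree.ElementTree as ET

# Sample document from the question, with the elided children trimmed.
XML = """<parentnode>
  <node1 a="b" b="c">
    <child1/>
    <child2/>
    <childN/>
  </node1>
  <node2/>
</parentnode>"""

root = ET.fromstring(XML)

# find() returns a reference to the first matching element.
# Nothing is copied, and the children are not visited.
node1 = root.find("./node1")

# Reading the attributes touches only the element itself.
attrs = dict(node1.attrib)
print(attrs)
```

Any cost from the children appears only if the calling code (or a wrapper around the parser) goes on to traverse or copy the subtree.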

Upvotes: 5

Matthew Lund

Reputation: 4012

Well, if the whole document is already DOM'd up, then selecting node1 with XPath does no further DOM building. At that point the fact that node1 has children is irrelevant to performance.

However, if we assume that the document is not DOM'd up, then we're probably talking about a forward-only reader. Some forward-only readers can evaluate the kind of XPath you need.

Upvotes: 0

snoofkin

Reputation: 8895

Use @ to get attributes, for instance:

  • /parentnode/node1/@a - will fetch the "b" value
  • /parentnode/node1/@b - will fetch the "c" value

Upvotes: 0

Steven D. Majewski

Reputation: 2157

If you just want the attributes, then just select the attributes: /parentnode/node1/@*

But (as noted in another answer) an XPath processor still has to parse the whole file. You won't be saving much.

If you only want to parse part of the file and then stop after you've gotten the info you need, you should probably use SAX or some other API that gives you lower level control of the parsing.
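As a rough illustration of the SAX approach, here is a sketch using Python's standard-library xml.sax module. The names StopParsing, Node1Handler and read_node1_attributes are made up for this example, and aborting the parse by raising an exception from the handler is one common way to stop early, not the only one:

```python
import xml.sax


class StopParsing(Exception):
    """Raised to abandon the parse once the wanted element has been seen."""


class Node1Handler(xml.sax.ContentHandler):
    """Records the attributes of the first <node1> element, then stops."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def startElement(self, name, attrs):
        if name == "node1":
            # Copy the attributes into a plain dict, then abort the parse
            # so the element's children are never processed.
            self.attrs = dict(attrs.items())
            raise StopParsing


def read_node1_attributes(xml_text):
    """Return node1's attributes as a dict, or None if there is no node1."""
    handler = Node1Handler()
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except StopParsing:
        pass  # Expected: we stopped on purpose after finding node1.
    return handler.attrs


sample = '<parentnode><node1 a="b" b="c"><child1/><child2/></node1><node2/></parentnode>'
print(read_node1_attributes(sample))
```

This only pays off when the document has not been parsed yet. With an already parsed xmlDocPtr, as in the question, the parsing cost has been paid regardless.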

Upvotes: 2
