Reputation: 24998
I am self-studying XPath from Pro XML Development with Java. Just for practice I have constructed a sample XML document and some XPath expressions.
Below are a few XPath expressions along with their explanations and a few related questions. Please correct me if my explanations are wrong and answer the questions wherever applicable.
<?xml version="1.0" encoding="UTF-8" ?>
<people>
<student scholarship="Yes">
<name>John</name>
<course>Computer Technology</course>
<semester>6</semester>
<scheme>E</scheme>
</student>
<student>
<name>Foo</name>
<course>Industrial Electronics</course>
<semester>6</semester>
<scheme>E</scheme>
</student>
<grumpy-cat>
<soup-noodle>
<student>
<name>Dingle</name>
<course>Grumpiness</course>
<semester>3</semester>
<scheme>E</scheme>
</student>
</soup-noodle>
</grumpy-cat>
</people>
Expression 1: /people/student[@scholarship='Yes']/name
Explanation: Will select the elements <name>..</name>
which are contained in <people>
such that <student>
has an attribute named scholarship
with a value of Yes
Question: Will this also select the value John in it ????
Expression 2: /people/student[2]
Explanation: Will select the element <student>..</student>
which is at the 2nd position in the element <people>
Question: Will it also select the child nodes within ?
Expression 3: /people/student/@scholarship
Explanation: Will select the attribute scholarship in the element student. If there were multiple <student scholarship="">
then it would select multiple attributes
Expression 4: //name[ancestor::student]
Explanation: Will select all the <name>..</name>
elements
//
means 'all-the-descendants'. In my context it means 'I don't care who the descendants are
as long as my immediate ancestor is student'
Upvotes: 1
Views: 237
Reputation: 243469
Expression 1:
/people/student[@scholarship='Yes']/name
Explanation: Will select the elements <name>..</name> which are contained in <people> such that <student> has an attribute named scholarship with a value of Yes. Question: Will this also select the value John in it?
This expression selects every name element that is a child of a student element whose scholarship attribute has the string value "Yes", where that student element is itself a child of the top element (named people) of the XML document. XPath doesn't select "values"; it selects nodes. In this case the string "John" is the string value of the selected name element: that element has a single child text node, whose string value is "John".
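The difference between a selected node and its string value can be checked with the standard javax.xml.xpath API. The following is only a minimal sketch: the class name NodeVersusValue and the helper evaluate are made up for illustration, and the question's document is embedded as a string without indentation.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original posts.
class NodeVersusValue {
    // The sample document from the question, written without whitespace between elements.
    static final String XML =
        "<people>"
      + "<student scholarship=\"Yes\"><name>John</name><course>Computer Technology</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<student><name>Foo</name><course>Industrial Electronics</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<grumpy-cat><soup-noodle>"
      + "<student><name>Dingle</name><course>Grumpiness</course>"
      + "<semester>3</semester><scheme>E</scheme></student>"
      + "</soup-noodle></grumpy-cat>"
      + "</people>";

    static final String EXPR = "/people/student[@scholarship='Yes']/name";

    // Evaluates an XPath 1.0 expression against XML and returns the requested result type.
    static Object evaluate(String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Asking for a node-set yields the name element itself.
        NodeList selected = (NodeList) evaluate(EXPR, XPathConstants.NODESET);
        Node name = selected.item(0);
        System.out.println(selected.getLength());   // number of selected nodes
        System.out.println(name.getNodeName());     // the selected node is an element
        // The text "John" lives in a child text node of that element.
        System.out.println(name.getFirstChild().getNodeType() == Node.TEXT_NODE);
        // Asking for a string converts the first selected node to its string value.
        System.out.println(evaluate(EXPR, XPathConstants.STRING));
    }
}
```

With this document the node-set contains one name element, and requesting XPathConstants.STRING returns "John". The conversion to a string is done by the caller's choice of result type, not by the path expression itself.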
Expression 2: /people/student[2] Explanation: Will select the element <student>..</student> which is at the 2nd position in the element <people>. Question: Will it also select the child nodes within?
This selects the second (in document order) student
child of the top element (whose name must be people
). The child nodes of the selected element are not selected themselves. The number of selected nodes can be obtained using the count()
function:
count(/people/student[2])
and it is 1
-- this means that only the element (but not its children or descendants) is selected.
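The count() check can be reproduced from Java. This is a sketch under the same assumptions as before: the class name SelectedNodeCount and the helper number are illustrative only, and the document is the question's sample embedded as a string.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original posts.
class SelectedNodeCount {
    // The sample document from the question, written without whitespace between elements.
    static final String XML =
        "<people>"
      + "<student scholarship=\"Yes\"><name>John</name><course>Computer Technology</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<student><name>Foo</name><course>Industrial Electronics</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<grumpy-cat><soup-noodle>"
      + "<student><name>Dingle</name><course>Grumpiness</course>"
      + "<semester>3</semester><scheme>E</scheme></student>"
      + "</soup-noodle></grumpy-cat>"
      + "</people>";

    // Evaluates an XPath 1.0 expression that yields a number, such as count(...).
    static double number(String expr) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Only the student element itself is in the result.
        System.out.println(number("count(/people/student[2])"));
        // Its element children are selected only when a further step asks for them.
        System.out.println(number("count(/people/student[2]/*)"));
    }
}
```

The first count is 1 and the second is 4 (name, course, semester, scheme). The children stay attached to the selected student node in the tree, so they can still be reached from it, but they are not members of the result of /people/student[2].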
Expression 3: /people/student/@scholarship Explanation: Will select the attribute scholarship in the element student. If there were multiple <student scholarship=""> then it would select multiple attributes
This selects the scholarship
attribute of any student
element that is a child of the top element (whose name must be people
). This means that if there are N student
elements that are children of the people
top element, and if each of these has a scholarship
attribute, then N scholarship attributes will be selected.
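For the sample document in the question, only one of the two student children of people carries the attribute, so only one attribute node comes back. A small sketch (the class name AttributeSelection and the helper select are assumptions for illustration):

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original posts.
class AttributeSelection {
    // The sample document from the question, written without whitespace between elements.
    static final String XML =
        "<people>"
      + "<student scholarship=\"Yes\"><name>John</name><course>Computer Technology</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<student><name>Foo</name><course>Industrial Electronics</course>"
      + "<semester>6</semester><scheme>E</scheme></student>"
      + "<grumpy-cat><soup-noodle>"
      + "<student><name>Dingle</name><course>Grumpiness</course>"
      + "<semester>3</semester><scheme>E</scheme></student>"
      + "</soup-noodle></grumpy-cat>"
      + "</people>";

    // Evaluates an XPath 1.0 expression and returns the selected nodes.
    static NodeList select(String expr) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NodeList students = select("/people/student");
        NodeList attributes = select("/people/student/@scholarship");
        System.out.println(students.getLength());    // student children of people
        System.out.println(attributes.getLength());  // those that have a scholarship attribute
        Node first = attributes.item(0);
        // The selected node is the attribute itself, not the element that owns it.
        System.out.println(first.getNodeType() == Node.ATTRIBUTE_NODE);
        System.out.println(first.getNodeName() + "=" + first.getNodeValue());
    }
}
```

Here two student elements match /people/student, but only one scholarship attribute node is selected, because a student without the attribute contributes nothing to the result.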
Expression 4: //name[ancestor::student] Explanation: Will select all the <name>..</name> elements. // means 'all-the-descendants'. In my context it means 'I don't care who the descendants are as long as my immediate ancestor is student'
This selects all name elements that have a student ancestor. That ancestor need not be the immediate parent; it can also be an ancestor of the immediate parent, at any level.
Here one can write an equivalent XPath expression that doesn't contain any reverse axes:
//student//name
In case you wanted to select all name
elements whose parent is a student
element, one way to express this is:
//student/name
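In the question's document every name element happens to be a direct child of a student, so the ancestor and parent forms return the same nodes there. To make the difference visible, the sketch below uses a hypothetical variant of the document in which one name is wrapped in an invented details element; the class name AncestorVersusParent and the helper count are likewise assumptions.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original posts.
class AncestorVersusParent {
    // Hypothetical variant of the sample: the second name sits one level deeper,
    // inside an invented <details> wrapper, so student is its ancestor but not its parent.
    static final String XML =
        "<people>"
      + "<student><name>John</name></student>"
      + "<student><details><name>Dingle</name></details></student>"
      + "</people>";

    // Returns how many nodes an XPath 1.0 expression selects in XML.
    static int count(String expr) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Any student ancestor, at any depth: both names qualify.
        System.out.println(count("//name[ancestor::student]"));
        System.out.println(count("//student//name"));
        // student must be the parent: only the directly nested name qualifies.
        System.out.println(count("//student/name"));
        System.out.println(count("//name[parent::student]"));
    }
}
```

On this variant the two ancestor forms select 2 nodes each and the two parent forms select 1 node each.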
Finally, I would recommend using a tool like the XPath Visualizer (which I created 12 years ago) that has helped many thousands of people learn XPath by playing and having fun.
Upvotes: 2
Reputation: 167471
All four of your XPath expressions select nodes in the input tree. In XPath 1.0 such expressions return a set of nodes (the set can be empty or contain one or more nodes of the input tree); in XPath 2.0 they return a sequence of nodes (which again can be empty or contain one or more nodes of the input tree).
Expression 1 selects a name element node in the given input tree; this node contains a single text node with the value John.
Expression 2 selects a student element node in the input tree. That student element node has several child nodes, but XPath selection simply selects a node in the input tree; it does not modify anything or create new nodes.
Expression 3 selects a scholarship attribute node. You are right that it would select several such nodes if the input XML contained several student element nodes with scholarship attributes.
Expression 4, //name[ancestor::student], is a short form (see http://www.w3.org/TR/xpath/#path-abbrev) of /descendant-or-self::node()/name[ancestor::student], which is in turn a short form of /descendant-or-self::node()/child::name[ancestor::student]. So it selects all name child elements of the root node as well as of all descendant nodes of the root node, where the name elements have a student ancestor element node. Your explanation of that expression is wrong, both the part about "all the descendants" (which is at least imprecise) and the part about "my immediate ancestor is student". The immediate ancestor is the parent, expressed simply as parent::student in XPath, while your ancestor::student looks up all levels of ancestors. And "all the descendants" would be /descendant::name. On the other hand, given the way // is defined and that your next step is name, //name boils down to the same as /descendant::name.
Upvotes: 2