Reputation: 3060
How can I use recursive AND conditional selection in XPath?
For example, given this document:
<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
  <file name="foo.mp4">
    <chunks>
      <file>
        <chunks>
          <file>
            <chunks>
              <file>1</file>
              <file>2</file>
              <file>3</file>
              <file>4</file>
            </chunks>
          </file>
          <file>
            <chunks>
              <file>5</file>
              <file>6</file>
              <file>7</file>
              <file>8</file>
            </chunks>
          </file>
        </chunks>
      </file>
      <file>
        <chunks>
          <file>
            <chunks>
              <file>9</file>
              <file>10</file>
              <file>11</file>
              <file>12</file>
            </chunks>
          </file>
          <file>
            <chunks>
              <file>13</file>
              <file>14</file>
              <file>15</file>
              <file>16</file>
            </chunks>
          </file>
        </chunks>
      </file>
    </chunks>
  </file>
</root>
I would like to select just:
<file>1</file>
<file>2</file>
<file>3</file>
<file>4</file>
So, effectively this:
//*[@name="foo.mp4"]/chunks/*[1]/chunks/*[1]/chunks/*
But with a generalized approach, i.e. something that would also cover more deeply nested structures. Something like this:
//*[@name="foo.mp4"]/(chunks/*[1]/)+chunks/*
Here, (cond)+ is not XPath syntax; it is a regex-like representation of what I want.
Upvotes: 3
Views: 1421
Reputation: 111726
Recursion implies self-reference and is not directly available in XPath. The usual way to ignore intervening levels of elements is via the descendant-or-self axis (//), anchored by a desired property.
For example, each of the following XPath expressions,
All file elements with values less than 5:
//file[number() < 5]
The first 4 leaf file elements:
//file[not(*)][count(preceding::file[not(*)]) < 4]
The file leaf elements whose ancestors have no predecessors:
//file[not(*)][not(ancestor::*[preceding::*])]
will select
<file>1</file>
<file>2</file>
<file>3</file>
<file>4</file>
as requested.
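For readers who want to check the logic outside an XPath engine, here is a minimal Python sketch of the first two expressions. It is a restatement, not an evaluation of the XPath itself: the standard library's xml.etree.ElementTree supports only a subset of XPath (no number(), count(), or preceding axis), so the predicates are rewritten as plain Python. The XML string is a condensed copy of the question's sample, and the names XML, as_number, by_value, and by_position are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question (namespaces omitted).
XML = """
<root>
  <file name="foo.mp4">
    <chunks>
      <file><chunks>
        <file><chunks><file>1</file><file>2</file><file>3</file><file>4</file></chunks></file>
        <file><chunks><file>5</file><file>6</file><file>7</file><file>8</file></chunks></file>
      </chunks></file>
      <file><chunks>
        <file><chunks><file>9</file><file>10</file><file>11</file><file>12</file></chunks></file>
        <file><chunks><file>13</file><file>14</file><file>15</file><file>16</file></chunks></file>
      </chunks></file>
    </chunks>
  </file>
</root>
"""

def as_number(elem):
    """Rough stand-in for XPath number(): parse the element's full text, NaN on failure."""
    try:
        return float("".join(elem.itertext()))
    except ValueError:
        return float("nan")

root = ET.fromstring(XML)

# Counterpart of //file[number() < 5]: NaN compares false, so non-numeric nodes drop out.
by_value = [e for e in root.iter("file") if as_number(e) < 5]

# Counterpart of //file[not(*)][count(preceding::file[not(*)]) < 4]:
# iter() walks in document order, so the first four leaves are the ones
# with fewer than four leaf file elements before them.
leaf_files = [e for e in root.iter("file") if len(e) == 0]
by_position = leaf_files[:4]

print([e.text for e in by_value])     # ['1', '2', '3', '4']
print([e.text for e in by_position])  # ['1', '2', '3', '4']
```

Both lists hold the same four elements for this sample; on other data the value-based and position-based selections can of course differ.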
Upvotes: 5
Reputation: 89325
There is no such thing as recursive XPath, as far as I know, so you'll need to combine XPath with something else, such as XSLT or a general-purpose programming language, to get recursion. With pure XPath, you'll need to formulate the requirement differently, if possible.
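As an illustration of doing the recursion in a host language, here is a minimal Python sketch that repeats the chunks/*[1] step from the question until it reaches leaf elements. It assumes Python's standard xml.etree.ElementTree, uses a condensed copy of the question's sample, and the function name first_leaf_chunk is hypothetical.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question (namespaces omitted).
XML = """
<root>
  <file name="foo.mp4">
    <chunks>
      <file><chunks>
        <file><chunks><file>1</file><file>2</file><file>3</file><file>4</file></chunks></file>
        <file><chunks><file>5</file><file>6</file><file>7</file><file>8</file></chunks></file>
      </chunks></file>
      <file><chunks>
        <file><chunks><file>9</file><file>10</file><file>11</file><file>12</file></chunks></file>
        <file><chunks><file>13</file><file>14</file><file>15</file><file>16</file></chunks></file>
      </chunks></file>
    </chunks>
  </file>
</root>
"""

def first_leaf_chunk(file_elem):
    """Follow chunks/*[1] repeatedly and return the children of the deepest chunks reached."""
    chunks = file_elem.find("chunks")
    if chunks is None or len(chunks) == 0:
        return []
    first = chunks[0]
    if first.find("chunks") is None:
        # The first child has no nested chunks, so this level holds the leaves.
        return list(chunks)
    # Otherwise apply the same step one level down.
    return first_leaf_chunk(first)

root = ET.fromstring(XML)
# XPath is used only to locate the starting element; the recursion is plain Python.
start = root.find(".//file[@name='foo.mp4']")
leaves = first_leaf_chunk(start)
print([e.text for e in leaves])  # ['1', '2', '3', '4']
```

Because the depth is discovered at run time, the same function works unchanged if the document gains further levels of nesting.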
I don't know if this applies to your actual data, but suppose you can reformulate the requirement as something like the following:
"Within file[@name='foo.mp4'], find the first <chunks> that contains leaf <file> elements, i.e. <file> elements that contain no child elements, only text nodes, and return those leaf <file> elements."
Then a pure XPath solution is possible:
(//file[@name='foo.mp4']//chunks[not(file/*)])[1]/file
Tested against the sample XML in the question, the above XPath expression returns the expected file elements 1 to 4.
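The same reformulated requirement can be cross-checked with a short Python sketch. It does not run the XPath expression above; it restates its steps with the standard library's xml.etree.ElementTree, whose XPath subset cannot express the parenthesized (...)[1] form. The sample is a condensed copy of the question's XML, the helper name holds_only_leaf_files is hypothetical, and the sketch assumes there is a single file element named foo.mp4.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question (namespaces omitted).
XML = """
<root>
  <file name="foo.mp4">
    <chunks>
      <file><chunks>
        <file><chunks><file>1</file><file>2</file><file>3</file><file>4</file></chunks></file>
        <file><chunks><file>5</file><file>6</file><file>7</file><file>8</file></chunks></file>
      </chunks></file>
      <file><chunks>
        <file><chunks><file>9</file><file>10</file><file>11</file><file>12</file></chunks></file>
        <file><chunks><file>13</file><file>14</file><file>15</file><file>16</file></chunks></file>
      </chunks></file>
    </chunks>
  </file>
</root>
"""

def holds_only_leaf_files(chunks):
    """Mirror the predicate not(file/*): no file child has element children."""
    return all(len(f) == 0 for f in chunks.findall("file"))

root = ET.fromstring(XML)
start = root.find(".//file[@name='foo.mp4']")

# iter() visits descendants in document order, so next() picks the same
# chunks element that the [1] positional filter would.
first = next((c for c in start.iter("chunks") if holds_only_leaf_files(c)), None)
result = first.findall("file") if first is not None else []

print([e.text for e in result])  # ['1', '2', '3', '4']
```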
Upvotes: 3