Reputation: 1149
When using XPath or XQuery, is there a way to limit the depth of the result?
I am using BaseX, which supports XQuery 3.1 and XSLT 2.0.
For example, given this input document:
<country name="United States">
  <state name="California">
    <county name="Alameda">
      <city name="Alameda" />
      <city name="Oakland" />
      <city name="Piedmont" />
    </county>
    <county name="Los Angeles">
      <city name="Los Angeles" />
      <city name="Malibu" />
      <city name="Burbank" />
    </county>
    <county name="Marin">
      <city name="Fairfax" />
      <city name="Larkspur" />
      <city name="Ross" />
    </county>
    <county name="Sacramento">
      <city name="Folsom" />
      <city name="Elk Grove" />
      <city name="Sacramento" />
    </county>
  </state>
</country>
If I execute the query /country/state, I get the following result:
<state name="California">
  <county name="Alameda">
    <city name="Alameda"/>
    <city name="Oakland"/>
    <city name="Piedmont"/>
  </county>
  <county name="Los Angeles">
    <city name="Los Angeles"/>
    <city name="Malibu"/>
    <city name="Burbank"/>
  </county>
  <county name="Marin">
    <city name="Fairfax"/>
    <city name="Larkspur"/>
    <city name="Ross"/>
  </county>
  <county name="Sacramento">
    <city name="Folsom"/>
    <city name="Elk Grove"/>
    <city name="Sacramento"/>
  </county>
</state>
I would like to limit the depth of the result. Ideally, there'd be a way for me to specify the depth, rather than hard-coding an XPath query.
As an example, I would like to limit the result to each result node and its children, excluding the grandchildren, so the result would be:
<state name="California">
  <county name="Alameda" />
  <county name="Los Angeles" />
  <county name="Marin" />
  <county name="Sacramento" />
</state>
Upvotes: 2
Views: 278
Reputation: 1149
@zx845's post got me on the right track. My ultimate goal was to limit the depth of the result, with the intent of getting a "summary" and the metadata I need to get deeper results if necessary.
BaseX has a function, db:node-id, which returns the internal node ID of a given node. Another function, db:open-id, returns the node with a given ID from a named database.
Suppose the input is:
<country name="United States">
  <state name="California">
    <county name="Alameda">
      <city name="Alameda"/>
      <city name="Oakland"/>
      <city name="Piedmont"/>
    </county>
    <county name="Los Angeles">
      <city name="Los Angeles"/>
      <city name="Malibu"/>
      <city name="Burbank"/>
    </county>
    <county name="Marin">
      <city name="Fairfax"/>
      <city name="Larkspur"/>
      <city name="Ross"/>
    </county>
    <county name="Sacramento">
      <city name="Folsom"/>
      <city name="Elk Grove"/>
      <city name="Sacramento"/>
    </county>
  </state>
  <state name="New York">
    <county name="Albany">
      <city name="Albany"/>
      <city name="Cohoes"/>
      <city name="Watervliet"/>
    </county>
    <county name="Erie">
      <city name="Buffalo"/>
      <city name="Lackawanna"/>
      <city name="Tonawanda"/>
    </county>
  </state>
</country>
I defined this function, which lets me control the depth and returns the node ID of each node:
declare function local:abbreviated($input, $depth as xs:integer)
{
  if ($depth = 0) then
    (: Depth exhausted: emit only a placeholder holding the node ID. :)
    element node {
      db:node-id($input)
    }
  else
    (: Copy the element, tag it with its node ID, and recurse into its child elements. :)
    element { node-name($input) } {
      attribute node-id {
        db:node-id($input)
      },
      $input/@*,
      $input/text(),
      for $child in $input/*
      return local:abbreviated($child, $depth - 1)
    }
};
If I execute the following:
declare variable $input := /country/state;
for $result in $input
return local:abbreviated($result, 1)
Then I get this result:
<state node-id="3" name="California">
  <node>5</node>
  <node>13</node>
  <node>21</node>
  <node>29</node>
</state>
<state node-id="37" name="New York">
  <node>39</node>
  <node>47</node>
</state>
Now, when I process the results, if the user wants more details for a state element, I can process each 'node' element and execute this query to get the actual contents of the node:
local:abbreviated(db:open-id('states', 5), 2)
Resulting in:
<county node-id="5" name="Alameda">
  <city node-id="7" name="Alameda"/>
  <city node-id="9" name="Oakland"/>
  <city node-id="11" name="Piedmont"/>
</county>
Upvotes: 0
Reputation: 163595
Actually, the result of your query is a single node: the state node in the source document. Some software then displays that result (the state node) in a particular format, but in principle the result could be displayed differently without changing the query. For example, I'm aware of software that would display the result of this query as:
/country[1]/state[1]
So you need to separate two questions: what nodes does the query return, and how are they displayed? In some cases it might make sense to create a processing pipeline where the first step selects the nodes of interest, and the second step controls the presentation of the results.
Personally I would always do the second step in XSLT, but some people prefer XQuery. Take your pick.
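To illustrate the display point with a rough sketch: the standard fn:path function (XPath 3.0 and later) renders each selected node as a path string instead of serialized markup. The query below assumes the document from the question is the context item, for example an opened BaseX database; that setup is an assumption and not part of the original post.

```xquery
(: Same selection as in the question, but each node is presented as a path string. :)
for $state in /country/state
return path($state)
```

For elements in no namespace, the returned string should have the form /Q{}country[1]/Q{}state[1]. The set of selected nodes is unchanged; only the presentation step differs.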
Upvotes: 0
Reputation: 29052
One easy and straightforward way is to use XSLT 2.0 with an empty template that suppresses all child elements of <county>. The <xsl:strip-space> declaration removes the whitespace-only text nodes that would otherwise be left behind between the suppressed children.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:strip-space elements="*" />
  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="/">
    <xsl:apply-templates select="/country/state" />
  </xsl:template>
  <xsl:template match="county/*" />
</xsl:stylesheet>
Output is:
<?xml version="1.0" encoding="UTF-8"?>
<state name="California">
  <county name="Alameda"/>
  <county name="Los Angeles"/>
  <county name="Marin"/>
  <county name="Sacramento"/>
</state>
With XQuery, a solution could look like this:
for $st in doc("b.xml")/country/state return
  element { node-name($st) } { $st/@*,
    for $ct in $st/county return
      element { node-name($ct) } { $ct/@* }
  }
The output is the same.
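If the depth should be a parameter rather than hard-coded element names, one possible generalization is a small recursive function. This is only a sketch: the name local:limit-depth is hypothetical, it relies on standard XQuery only, it assumes the input document is the context item, and it copies attributes and child elements while dropping text nodes.

```xquery
(: Hypothetical helper: copy $node with its attributes and keep
   $depth further levels of child elements below it. :)
declare function local:limit-depth(
  $node  as element(),
  $depth as xs:integer
) as element() {
  element { node-name($node) } {
    $node/@*,
    if ($depth > 0) then
      for $child in $node/*
      return local:limit-depth($child, $depth - 1)
    else ()
  }
};

(: Depth 1 keeps each state and its counties, but not the cities. :)
for $state in /country/state
return local:limit-depth($state, 1)
```

With a depth of 0 only the state elements and their attributes would be returned; with a depth of 2 the cities would be included as well.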
Upvotes: 4