Reputation: 10156
I use elementpath to handle some XPath queries. I have an XML document
with a linear structure in which every item has a unique id
attribute.
<items>
  <item id="1">...</item>
  <item id="2">...</item>
  <item id="3">...</item>
  ... 500k elements
  <item id="500003">...</item>
</items>
I want the parser to find the first occurrence without traversing all the nodes. For example, I want to select //items/item[@id = '3']
and stop after iterating over only 3 nodes (not all 500k of them). That would be a useful optimization in many cases.
Upvotes: 1
Views: 33
Reputation: 167716
One option is XSLT 3 streaming: pass the XPath as a static parameter, then use xsl:iterate
with xsl:break
to produce the "early exit" once the first item sought has been found. An example would be
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all">

  <xsl:param name="path" static="yes" as="xs:string" select="'items/item[@id = ''3'']'"/>

  <xsl:output method="xml"/>

  <xsl:mode on-no-match="shallow-copy" streamable="yes"/>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:iterate _select="{$path}">
      <xsl:if test="position() = 1">
        <xsl:copy-of select="."/>
        <xsl:break/>
      </xsl:if>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
You can run it from Python with SaxonC EE (unfortunately, streaming is only supported by EE), e.g.
import saxonc

# license=True asks for the licensed (EE) features that streaming needs
with saxonc.PySaxonProcessor(license=True) as proc:
    print("Test SaxonC on Python")
    print(proc.version)
    xslt30proc = proc.new_xslt30_processor()
    # 'path' is a static parameter, so it is set before the stylesheet is compiled
    xslt30proc.set_parameter('path', proc.make_string_value('/items/item[@id = "2"]'))
    transformer = xslt30proc.compile_stylesheet(stylesheet_file='iterate-items-early-exit1.xsl')
    xdm_result = transformer.apply_templates_returning_value(source_file='items-sample1.xml')
    if transformer.exception_occurred:
        print(transformer.error_message)
    print(xdm_result)
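If a Saxon EE licence is not an option, the same "stop at the first match" idea can be sketched in plain Python with the standard library's xml.etree.ElementTree.iterparse. This is a different technique from the XSLT streaming above, and it does not evaluate arbitrary XPath: the helper name find_first_item and the inline sample data are made up for illustration, and the check is hard-coded to item elements with a given id (it also ignores where in the tree the item sits).

```python
import io
import xml.etree.ElementTree as ET


def find_first_item(source, item_id):
    """Return the first <item> whose id attribute equals item_id, or None.

    Hypothetical helper: it only handles the item/@id case from the question.
    """
    # iterparse feeds the parser incrementally; an "end" event fires as soon
    # as an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            if elem.get("id") == item_id:
                return elem  # returning here abandons the rest of the input
            elem.clear()  # discard the content of items that did not match
    return None


# Small in-memory stand-in for the 500k-item file (assumed data).
sample = io.BytesIO(
    b"<items><item id='1'>a</item><item id='2'>b</item>"
    b"<item id='3'>c</item><item id='4'>d</item></items>"
)
match = find_first_item(sample, "3")
print(ET.tostring(match, encoding="unicode"))
```

Note that the parser reads its input in chunks, so it stops shortly after the matching item rather than exactly at it; for a match near the start of a large file that should still avoid visiting most of the 500k nodes.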
Upvotes: 1