Reputation: 2228
This is the format of my XML data:
<?xml version="1.0" encoding="utf-8"?>
<rowdata>
<row Id="1" type="1" data="text" ... />
<row Id="2" type="2" data="text" parent="1" ... />
<row Id="3" type="1" data="text" ... />
<row Id="4" type="1" data="text" ... />
<row Id="5" type="2" data="text" parent="4" ... />
...
And this is my XSL sheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1"/>
<xsl:strip-space elements="*" />
<xsl:template match="/rowdata">
<xsl:for-each select="row">
<xsl:if test="@Id = 10000">
<xsl:value-of select="@data"/><xsl:text>
</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Facts:
Problem:
xsltproc input.xsl input.xml
is very slow. Execution takes about 10 seconds for a single run (and many runs need to be made).
Already tried:
Three questions:
Upvotes: 0
Views: 996
Reputation: 163458
What are you including in your 10 seconds? Does this include compiling the stylesheet and/or parsing/loading the source document, or is it purely the XSLT execution time?
I would expect that building an in-memory tree representation of your 900 MB input file is what takes most of the time (10 seconds would be pretty fast for that operation). If you need to run the stylesheet many times, the best way to improve performance is to build the source tree only once and reuse it. But then you won't be able to run directly from the command line.
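To see where the time goes, it can help to time the parse and the lookups separately. The following is only a rough sketch in Python's standard library, not xsltproc itself: the sample rows, the `data` values and the `find_data` helper are made up for illustration, and the linear scan stands in for the stylesheet's loop.

```python
import time
import xml.etree.ElementTree as ET

# Tiny stand-in for the real input; the attribute values are invented.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<rowdata>
<row Id="1" type="1" data="first" />
<row Id="2" type="2" data="second" parent="1" />
<row Id="3" type="1" data="third" />
</rowdata>
"""


def find_data(root, row_id):
    """Linear scan over <row> elements; returns the first matching data value."""
    for row in root.iter("row"):
        if row.get("Id") == row_id:
            return row.get("data")
    return None


# Phase 1: build the in-memory tree (done once).
t0 = time.perf_counter()
root = ET.fromstring(SAMPLE)
parse_seconds = time.perf_counter() - t0

# Phase 2: run several lookups against the same tree.
t0 = time.perf_counter()
results = [find_data(root, wanted) for wanted in ("3", "1", "2")]
lookup_seconds = time.perf_counter() - t0

print(f"parse: {parse_seconds:.6f}s, lookups: {lookup_seconds:.6f}s")
print(results)
```

On a large file, the first number is the cost you pay on every command-line run, while the second is the only part that a smarter search can reduce.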
In principle you can speed up this kind of stylesheet by using keys:
<xsl:key name="k" match="row" use="@Id"/>
<xsl:template match="/rowdata">
<xsl:value-of select="key('k', 10000)/@data"/>
</xsl:template>
However, that's only going to work if you can ensure that the key index is only built once, and is then used repeatedly. At this stage I can't tell you how this might work in xsltproc, because it's all getting processor-specific.
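The "build the index once, then reuse it" idea can be sketched outside XSLT as well. This is a plain Python dictionary standing in for `xsl:key`, not a description of how any XSLT processor implements keys; the sample data and the `build_index` name are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for the real input; the attribute values are invented.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<rowdata>
<row Id="1" type="1" data="first" />
<row Id="2" type="2" data="second" parent="1" />
<row Id="3" type="1" data="third" />
</rowdata>
"""


def build_index(root):
    """Map each Id to its data value, keeping the first row seen per Id."""
    index = {}
    for row in root.iter("row"):
        index.setdefault(row.get("Id"), row.get("data"))
    return index


root = ET.fromstring(SAMPLE)  # parse once
index = build_index(root)     # index once

# Every later lookup is a dictionary access instead of a scan over all rows.
for wanted in ("2", "3", "10000"):
    print(wanted, index.get(wanted))
```

The index only pays off if the process stays alive between lookups; rebuilding it on every run costs a full pass over the rows, the same as the original loop.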
You can terminate the search after the first hit simply by adding the predicate [1]. But you're looking for bigger gains than that.
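For comparison, here is a hedged sketch of stopping at the first hit while streaming the file, using `iterparse` from Python's standard library rather than XSLT. It is a different technique from the `[1]` predicate (no full tree is built), and the sample data and the `first_match` helper are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for the real input; the attribute values are invented.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<rowdata>
<row Id="1" type="1" data="first" />
<row Id="2" type="2" data="second" parent="1" />
<row Id="3" type="1" data="third" />
</rowdata>
"""


def first_match(source, row_id):
    """Stream the document and return the data of the first row with this Id."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            if elem.get("Id") == row_id:
                return elem.get("data")  # stop reading at the first hit
            elem.clear()  # discard the contents of rows that did not match
    return None


print(first_match(io.BytesIO(SAMPLE), "2"))
```

How much this helps depends on where the wanted row sits: a row near the end of the file still requires reading almost everything.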
Upvotes: 1
Reputation: 117073
Assuming there can be only one row where Id is 1000, you could do simply:
<xsl:template match="/rowdata">
<xsl:value-of select="row[@Id=1000]/@data"/>
</xsl:template>
I don't know if this will "dramatically increase the speed of the command".
Upvotes: 0