ylerjen

Reputation: 4259

XSLT optimisation: access a child multiple times or use a variable

I need some advice on optimizing my XSLT.

In my template I access a child multiple times, for example:

<xsl:template match="user">
 <h1><xsl:value-of select="address/country"/></h1>
 <p><xsl:value-of select="address/country"/></p>
 <p><xsl:value-of select="address/country"/></p>
  ... more and more...
 <p><xsl:value-of select="address/country"/></p>
</xsl:template>

Would it be better to store the content of the child element in a variable and reference the variable directly, to avoid parsing the tree every time:

<xsl:template match="user">
 <xsl:variable name="country" select="address/country"/>
 <h1><xsl:value-of select="$country"/></h1>
 <p><xsl:value-of select="$country"/></p>
 <p><xsl:value-of select="$country"/></p>
  ... more and more...
 <p><xsl:value-of select="$country"/></p>
</xsl:template>

Or will the use of a variable consume more resources than parsing the tree multiple times?

Upvotes: 4

Views: 1463

Answers (3)

Mathias Müller

Reputation: 22617

Usually, an XML file is parsed as a whole and held in memory as an XDM tree. So, I guess that by

than parsing the tree multiple times

you actually meant accessing the internal representation of the XML input multiple times. The figure below illustrates this; we are talking about the source tree:

[Figure omitted: taken from Michael Kay's XSLT 2.0 and XPath 2.0 Programmer's Reference, page 43]

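Likewise, a variable declared with xsl:variable is held in memory and needs to be accessed, too. With a select attribute, as in your example, the variable is bound to the selected nodes of the source tree; a temporary document is only created when the variable's value is given as element content instead.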

Now, what exactly do you mean by optimisation? Do you mean the time it takes to perform the transformation or CPU and memory usage (as you mention "resources" in your question)?

Also, performance depends on the implementation of your XSLT processor of course. The only reliable way of finding out is to actually test this.

Write two stylesheets that differ only in this regard and are otherwise identical. Then have both of them transform the same input XML and measure the time each takes.

My guess is that accessing a variable is faster and it is also more convenient to repeat a variable name than repeating full paths as you write code (this is sometimes called "convenience variables").


EDIT: Replaced with something more appropriate, as a response to your comment.

If you actually test this, write two stylesheets:

Stylesheet with variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <xsl:variable name="var" select="node/subnode"/>
         <subnode nr="1">
            <xsl:value-of select="$var"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="$var"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Stylesheet without variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <subnode nr="1">
            <xsl:value-of select="node/subnode"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="node/subnode"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Applied to the following input XML:

<root>
   <node>
      <subnode>helloworld</subnode>
   </node>
</root>

EDIT: As suggested by @Michael Kay, I measured the average time taken in 100 runs ("-t and -repeat:100 on the Saxon command line"):

with variable: 9 ms
without variable: 9 ms

This does not imply that the result is the same with your XSLT processor.

Upvotes: 4

Michael Kay

Reputation: 163458

For all performance questions, the answer is: it depends.

  • It depends what XSLT processor you are using, and on the optimizations it performs.

  • It's very likely to depend on how many children have to be searched to find the ones you are looking for.

The only way to find out is to measure it, and to measure it very carefully.

Personally, I would use a variable if there is a complex predicate involved, but not if I'm just looking for children by name.
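
As a rough sketch of a case where a variable earns its keep: the type and verified attributes below are hypothetical and do not appear in the question's input; they only stand in for "a complex predicate". Whether this is actually faster than repeating the path still depends on the processor.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="user">
      <!-- Assumption: address elements carry hypothetical @type and @verified
           attributes. The predicate is written once and its result is reused. -->
      <xsl:variable name="country"
                    select="address[@type = 'home' and @verified = 'true'][1]/country"/>
      <h1><xsl:value-of select="$country"/></h1>
      <p><xsl:value-of select="$country"/></p>
   </xsl:template>

</xsl:stylesheet>
```

Even if the timing turns out identical, the predicate now lives in a single place, which makes later changes less error-prone.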

In nearly all cases, even if it makes a difference, it is very unlikely to make a difference to the bottom line of your business. If you are interested in improving the bottom line of your business, there are probably better ways to employ your intellect.

Upvotes: 2

J. Katzwinkel

Reputation: 1943

Edit: Having been invited to re-evaluate my answer, I learned that your own suggestion is probably quite suitable for what you are trying to do. Unless you wrap a variable's select value in additional single quotes [which makes it a string constant], the variable will contain the selected element. [Instead of inserting that element's text content, you can even copy its entire sub-tree with <xsl:copy-of select="$country"/> if you wish.]
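
To make the distinction concrete, here is a minimal sketch based on the question's user/address/country structure. The variable name literal and the wrapping div are only illustrative.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="user">
      <!-- Bound to the selected country element(s) -->
      <xsl:variable name="country" select="address/country"/>
      <!-- Bound to the literal string address/country, not to any element -->
      <xsl:variable name="literal" select="'address/country'"/>

      <!-- In XSLT 1.0: the string value of the first selected country element -->
      <p><xsl:value-of select="$country"/></p>
      <!-- The characters "address/country" themselves -->
      <p><xsl:value-of select="$literal"/></p>
      <!-- The country element itself, including its whole sub-tree -->
      <div><xsl:copy-of select="$country"/></div>
   </xsl:template>

</xsl:stylesheet>
```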

For even less repetitive source, why not apply a dedicated template to the element in question:

<xsl:apply-templates select="address/country"/>
[...]
<xsl:template match="address/country">
   <h1><xsl:value-of select="."/></h1>
   <p><xsl:value-of select="."/></p>
   [...]
</xsl:template>

As @Mathias_Müller suggested, there are also ways to express your '...more and more...' behaviour without copying and pasting the same lines over and over. XSLT 2.0 accepts a numeric range in the select expression of xsl:for-each:

<xsl:for-each select="1 to 100">
  <p><xsl:value-of select="."/></p>
</xsl:for-each>
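
Note that inside such a loop the context item is an integer, so a relative path like address/country can no longer be evaluated there. This is one situation where the variable from the question is needed rather than merely convenient. A minimal sketch, assuming the question's user/address/country structure and an XSLT 2.0 processor:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="user">
      <!-- Bind the node while the context item is still the user element -->
      <xsl:variable name="country" select="address/country"/>
      <h1><xsl:value-of select="$country"/></h1>
      <!-- Inside the loop the context item is an integer, so use the variable -->
      <xsl:for-each select="1 to 100">
         <p><xsl:value-of select="$country"/></p>
      </xsl:for-each>
   </xsl:template>

</xsl:stylesheet>
```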

If XSLT 2.0 or later is not available, a slightly more complex solution is to call a named template recursively with xsl:call-template, passing a counter as a parameter and using a divide-and-conquer approach [to protect the stack]:

<xsl:call-template name="ntimes">
  <xsl:with-param name="counter" select="100"/>
</xsl:call-template>
[...]
<xsl:template name="ntimes">
  <xsl:param name="counter" select="0"/>
  <xsl:if test="$counter > 0">
    <xsl:choose>
      <xsl:when test="$counter = 1">
        <xsl:apply-templates select="address/country"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="half" select="floor($counter div 2)"/>
        <xsl:call-template name="ntimes">
          <xsl:with-param name="counter" select="$half"/>
        </xsl:call-template>
        <xsl:call-template name="ntimes">
          <xsl:with-param name="counter" select="$counter - $half"/>
        </xsl:call-template>
       [...]

Go here and here for explanation.
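
For reference, one way the fragment above might be completed into a self-contained XSLT 1.0 stylesheet. This is a sketch rather than the original code: the surrounding user template and the country template are assumptions based on the question, and the xsl:if/xsl:choose pair is folded into a single xsl:choose.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="user">
      <xsl:call-template name="ntimes">
         <xsl:with-param name="counter" select="100"/>
      </xsl:call-template>
   </xsl:template>

   <!-- Applies templates to address/country $counter times. Halving the
        count keeps the recursion depth at roughly log2($counter). -->
   <xsl:template name="ntimes">
      <xsl:param name="counter" select="0"/>
      <xsl:choose>
         <xsl:when test="$counter = 1">
            <xsl:apply-templates select="address/country"/>
         </xsl:when>
         <xsl:when test="$counter &gt; 1">
            <xsl:variable name="half" select="floor($counter div 2)"/>
            <xsl:call-template name="ntimes">
               <xsl:with-param name="counter" select="$half"/>
            </xsl:call-template>
            <xsl:call-template name="ntimes">
               <xsl:with-param name="counter" select="$counter - $half"/>
            </xsl:call-template>
         </xsl:when>
         <!-- A counter of 0 or less produces no output -->
      </xsl:choose>
   </xsl:template>

   <xsl:template match="address/country">
      <p><xsl:value-of select="."/></p>
   </xsl:template>

</xsl:stylesheet>
```

xsl:call-template does not change the context node, so address/country is still resolved relative to the user element at every level of the recursion.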

To be honest, I know nothing about performance and optimization in XSLT. I never considered it worth the effort, given that most of the time I use XSLT processors written in Java. Of what use is a finely tuned stylesheet when there is still an entire JVM to start up, consuming several hundred MB of RAM?

Upvotes: 1
