jdk

Reputation: 159

XSLT performance of fn:matches() versus =

Curious as to the difference in performance between equality and fn:matches(), I ran the following test:

<xsl:variable name="limit" select="123456789"/>
<xsl:template match="/*">
    <xsl:copy>
        <xsl:value-of
            select="
                count(for $i in (1 to $limit)
                return
                    if ('m' = 'm') then
                        true()
                    else
                        ())"/>
        <xsl:value-of
            select="
                count(for $i in (1 to $limit)
                return
                    if (matches('m', 'm')) then
                        true()
                    else
                        ())"
        />
    </xsl:copy>
</xsl:template>

Processed with Saxon HE/PE 9.7.0.15 on my Mac, the two xsl:value-of instructions together run in 6.9 seconds. Run alone, the first takes 5.2 s and the second 1.8 s. This seems unintuitive to me. Why would equality take longer to evaluate than matching?

Does this difference hold in general? That is, would choosing fn:matches() over = (for string comparison, obviously) generally improve performance?

Update: Under Saxon EE the tests sped up and balanced out: 5.1 s for both together and 2.5 s for each alone. That still leaves the original question: why would the operation that seems, prima facie, to be the simpler one take as long as or longer than the more sophisticated one?

Upvotes: 1

Views: 154

Answers (1)

Michael Kay

Reputation: 163587

I suspect that your measurements are flawed: the usual mistake is to fail to allow for Java warm-up time, so in effect you are just measuring how long it takes Java to load. Try using the command line option -repeat:50 to see the effect of this.

The other problem is that both your "=" and "matches" calls have constant arguments, which means the expression will be evaluated once, at compile time.

The execution plan that Saxon actually generates is like this (it's identical for both value-of instructions):

<fn name="count">
 <for var="i" as="xs:integer" slot="0">
  <range role="in" from="1" to="123456789"/>
  <true role="return"/>
 </for>
</fn>

That is, it has reduced both expressions to

count(for $i in 1 to 123456789 return true())

(which explains why both are taking the same time to execute) and it would be a very short step to optimize this further to

123456789
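
To compare the two operations themselves, at least one operand has to vary at run time so that the comparison cannot be folded into a constant at compile time. The following is only a sketch under that assumption, not a definitive benchmark: the `limit` parameter, its default value, and the `eq` and `matches` wrapper elements are hypothetical and not part of the original test. Each test compares `string($i)` against a fixed value, so the comparison depends on the loop variable.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <!-- Hypothetical parameter: smaller than the original limit so that
         repeated runs stay short. -->
    <xsl:param name="limit" as="xs:integer" select="1000000"/>

    <xsl:template match="/*">
        <xsl:copy>
            <!-- Equality test: the left operand depends on $i, so the
                 comparison is not a compile-time constant. -->
            <eq>
                <xsl:value-of
                    select="
                        count(for $i in (1 to $limit)
                        return
                            if (string($i) = '5') then
                                true()
                            else
                                ())"/>
            </eq>
            <!-- Regex test: the anchored pattern '^5$' accepts exactly the
                 string '5', so both tests select the same items. -->
            <matches>
                <xsl:value-of
                    select="
                        count(for $i in (1 to $limit)
                        return
                            if (matches(string($i), '^5$')) then
                                true()
                            else
                                ())"/>
            </matches>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Both loops also pay for the `string($i)` conversion, so any difference between them reflects the comparison rather than the setup. Timings are only meaningful with the -repeat:50 option mentioned above, and the optimizer may still rewrite these expressions, so it is worth inspecting the generated plan (for example with Saxon's -explain option) before drawing conclusions.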

Upvotes: 2
