Paulb

Reputation: 1531

XSLT User Defined Function

I am new to XSLT 2.0 and am intrigued by user-defined functions (<xsl:function>). In particular, I'd like to use a UDF to make the code more modular and readable.

I have this xsl:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
   version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
    <xsl:variable name="stopwords"
        select="document('stopwords.xml')//w/string()"/>
             <wordcount>
                <xsl:for-each-group group-by="." select="
                    for $w in //text()/tokenize(., '\W+')[not(.=$stopwords)] return $w">
                    <xsl:sort select="count(current-group())" order="descending"/>            
                    <word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
                </xsl:for-each-group>
             </wordcount>
</xsl:template>
</xsl:stylesheet>

I want to add more conditions (for example, excluding digits) to the expression for $w in //text()/tokenize(., '\W+')[not(.=$stopwords)], but the code would get messy.

Is a UDF an option for tidying up that section of code as it becomes more complex? Is it good practice to do so?

Upvotes: 10

Views: 28027

Answers (1)

Martin Honnen

Reputation: 167401

Well, you could write a function to be used in the predicate:

<xsl:function name="mf:check" as="xs:boolean">
  <xsl:param name="input" as="xs:string"/>
  <xsl:sequence select="not($input = $stopwords) and not(matches($input, '^[0-9]+$'))"/>
</xsl:function>

and use it in your code, e.g.:

<xsl:stylesheet
   version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns:mf="http://example.com/mf"
   exclude-result-prefixes="mf xs">
<xsl:output method="xml" indent="yes"/>

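    <!-- Returns true if $input is neither a stopword nor made up of digits only -->
    <xsl:function name="mf:check" as="xs:boolean">
      <xsl:param name="input" as="xs:string"/>
      <xsl:sequence select="not($input = $stopwords) and not(matches($input, '^[0-9]+$'))"/>
    </xsl:function>

    <!-- Global variable, so it is visible inside mf:check as well as in the templates -->
    <xsl:variable name="stopwords"
        select="document('stopwords.xml')//w/string()"/>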

    <xsl:template match="/">
        <wordcount>
            <xsl:for-each-group group-by="." select="
                for $w in //text()/tokenize(., '\W+')[mf:check(.)] return $w">
                <xsl:sort select="count(current-group())" order="descending"/>            
                <word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
            </xsl:for-each-group>
        </wordcount>
    </xsl:template>
</xsl:stylesheet>
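
If you want the function to be more self-contained, one possible variation is to pass the stopword list in as a second parameter instead of reading the global variable from inside the function. The following is only a sketch under a few assumptions: the name mf:is-countable and its $exclude parameter are made up for illustration, and the extra test for empty strings is my addition (tokenize() returns an empty string when the text starts or ends with a separator), not part of the original requirement.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
   version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns:mf="http://example.com/mf"
   exclude-result-prefixes="mf xs">
<xsl:output method="xml" indent="yes"/>

    <!-- Same stopword list as above, now typed explicitly as a sequence of strings -->
    <xsl:variable name="stopwords" as="xs:string*"
        select="document('stopwords.xml')//w/string()"/>

    <!-- Hypothetical helper: true if $token should be counted.
         $exclude is the list of words to skip, passed in by the caller. -->
    <xsl:function name="mf:is-countable" as="xs:boolean">
      <xsl:param name="token" as="xs:string"/>
      <xsl:param name="exclude" as="xs:string*"/>
      <!-- Assumption: empty tokens produced by tokenize() should be skipped too -->
      <xsl:sequence select="$token != ''
                            and not($token = $exclude)
                            and not(matches($token, '^[0-9]+$'))"/>
    </xsl:function>

    <xsl:template match="/">
        <wordcount>
            <xsl:for-each-group group-by="."
                select="//text()/tokenize(., '\W+')[mf:is-countable(., $stopwords)]">
                <xsl:sort select="count(current-group())" order="descending"/>
                <word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
            </xsl:for-each-group>
        </wordcount>
    </xsl:template>
</xsl:stylesheet>
```

With the list passed as an argument, the function no longer depends on a variable declared elsewhere in the stylesheet, so further conditions can be added in one place and the function can be reused with a different exclusion list.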

Upvotes: 16
