Alexander S.

Reputation: 2279

Count the frequency of a word contained in a string in XSLT

How can I count how often a word occurs in a string? I have to use XSLT 1.0.

Example XML:

<a>
   <b>Can you can a can as a canner can can a can?</b>
</a>

So the word "can" appears six times in this string. Can I count "can"? xD

I tried something like this, but it only returns "1":

<xsl:value-of select ="count(a/b[contains(.,'can')])" />

Additional question: how can I count "can" and "Can", but not "canner"?

Upvotes: 0

Views: 419

Answers (1)

michael.hor257k

Reputation: 117083

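Your expression returns 1 because count() counts nodes: it tells you how many b elements contain "can", not how many times the word occurs in the text. XSLT 1.0 has no built-in tokenizer, so the string has to be split with a recursive named template. Here is an example you could use as your starting point: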

XML

<root>
    <string>Can you can a can as a canner can can a can?</string>
</root>

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:variable name="upper-case" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower-case" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="punctuation" select="'.,:;!?'"/>

<xsl:template match="/root">
    <results>
        <xsl:for-each select="string">
            <count>
                <!-- lower-case the text and strip punctuation, so "Can" and "can?" both become "can" -->
                <xsl:call-template name="count-word-occurrences">
                    <xsl:with-param name="text" select="translate(translate(., $upper-case, $lower-case), $punctuation, '')"/>
                    <xsl:with-param name="word">can</xsl:with-param>
                </xsl:call-template>
            </count>
        </xsl:for-each>
    </results>
</xsl:template>

<xsl:template name="count-word-occurrences">
    <xsl:param name="text"/>
    <xsl:param name="word"/>
    <xsl:param name="delimiter" select="' '"/>
    <xsl:param name="count" select="0"/>
    
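    <!-- first token: everything up to the next delimiter (the appended delimiter covers the last word) -->
    <xsl:variable name="token" select="substring-before(concat($text, $delimiter), $delimiter)" />
    <!-- the boolean result of the comparison is converted to 1 or 0 by the addition -->
    <xsl:variable name="new-count" select="$count + ($token = $word)" />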
    
    <xsl:choose>
        <xsl:when test="contains($text, $delimiter)">
            <!-- recursive call -->
            <xsl:call-template name="count-word-occurrences">
                <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
                <xsl:with-param name="word" select="$word"/>
                <xsl:with-param name="count" select="$new-count"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$new-count"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<results>
  <count>6</count>
</results>

Caveats:

  1. The upper-case to lower-case conversion only covers the ASCII letters A to Z;
  2. The list of punctuation characters is incomplete;
  3. Beware of punctuation characters that stand in for a space (e.g. a hyphen): the words on either side end up in a single token, whether or not the character is in the punctuation list.
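
One possible way around the third caveat is to translate separator characters into spaces instead of deleting them, and then collapse the resulting runs of whitespace with normalize-space(). The following is only a sketch under assumptions: the $separators list (with the hyphen added) and the $spaces variable are names made up for this illustration, and the list is still incomplete. The count-word-occurrences template is the same as above.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:variable name="upper-case" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower-case" select="'abcdefghijklmnopqrstuvwxyz'"/>
<!-- assumption: characters to treat as word separators; extend as needed -->
<xsl:variable name="separators" select="'.,:;!?-'"/>
<!-- at least one space per separator; translate() ignores any excess -->
<xsl:variable name="spaces" select="'          '"/>

<xsl:template match="/root">
    <xsl:for-each select="string">
        <!-- lower-case, turn separators into spaces, then trim and collapse whitespace -->
        <xsl:variable name="clean" select="normalize-space(translate(translate(., $upper-case, $lower-case), $separators, $spaces))"/>
        <xsl:call-template name="count-word-occurrences">
            <xsl:with-param name="text" select="$clean"/>
            <xsl:with-param name="word" select="'can'"/>
        </xsl:call-template>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

<xsl:template name="count-word-occurrences">
    <xsl:param name="text"/>
    <xsl:param name="word"/>
    <xsl:param name="delimiter" select="' '"/>
    <xsl:param name="count" select="0"/>

    <!-- first token and running total, exactly as in the stylesheet above -->
    <xsl:variable name="token" select="substring-before(concat($text, $delimiter), $delimiter)"/>
    <xsl:variable name="new-count" select="$count + ($token = $word)"/>

    <xsl:choose>
        <xsl:when test="contains($text, $delimiter)">
            <!-- recursive call on the remainder of the text -->
            <xsl:call-template name="count-word-occurrences">
                <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
                <xsl:with-param name="word" select="$word"/>
                <xsl:with-param name="count" select="$new-count"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$new-count"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>
```

With this variant, a string such as "Can-do people can can." should be split into "can", "do", "people", "can", "can", so the hyphenated "Can" is counted as well. normalize-space() also turns tabs and line breaks into single spaces, which the plain space delimiter would otherwise miss.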

Upvotes: 3
