Alejandro DC

Reputation: 236

Counting and filtering based on it, inside an xsl:for-each

I have a source XML generated automatically by a clumsy program, which contains repeated and empty entries.

I want to select and show ONLY the entry that appears most often, ignoring the empty entries. For example, suppose my source XML is

<catalog>
    <letter>
      <char>A</char>
    </letter>
    <letter>
        <char>B</char>
    </letter>
    <letter>
        <char></char>
    </letter>
    <letter>
        <char>A</char>
    </letter>
    <letter>
        <char></char>
    </letter>
    <letter>
        <char></char>
    </letter>
</catalog>

Then, I want to get only A.

I can do:

<table><tr>
  <xsl:for-each select="catalog/letter[char!='']">
    <td><xsl:value-of select="char" /></td>
  </xsl:for-each>
</tr></table>

which will show me a table containing only A, B, A (not the empty columns).

Now I want to get only A, because it is the one that appears most often. How do I do it?

Update

Here is an attempt, which does not work.

<xsl:variable name="counter" select="0" />
<xsl:variable name="this" select="" />
<xsl:for-each select="catalog/letter[char!='']">
  <xsl:variable name="var" select="char" />
  <xsl:variable name="cou" select="count(/catalog/letter[char=$var])" />
  <xsl:if test="$cou &gt; counter">
    <xsl:variable name="this" select="$var" />
  </xsl:if>
<td><xsl:value-of select="$this" /></td>
</xsl:for-each>

Upvotes: 0

Views: 667

Answers (1)

Tim C

Reputation: 70648

The first thing to mention is that variables in XSLT are immutable. They cannot be changed once set, so a "counter" variable that you try to update inside the loop will not work.
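
To illustrate the scoping point, here is a minimal sketch; the variable name best is made up for the example. It is declared at the top level because XSLT 1.0 does not allow one local variable to shadow another within the same template, whereas a local variable may shadow a global one.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <!-- Hypothetical top-level variable, bound once for the whole stylesheet -->
   <xsl:variable name="best" select="'none'" />

   <xsl:template match="/">
      <xsl:if test="catalog/letter">
         <!-- Not an assignment: this binds a new local variable that only exists inside this xsl:if -->
         <xsl:variable name="best" select="'changed'" />
      </xsl:if>
      <!-- The local binding is out of scope here, so this still sees the top-level value 'none' -->
      <xsl:value-of select="$best" />
   </xsl:template>
</xsl:stylesheet>
```

The same thing happens with the inner xsl:variable in your attempt: it goes out of scope at the closing xsl:if tag, so nothing is carried over to the next iteration.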

To fix your current XSLT, you could add an xsl:sort to the xsl:for-each that sorts on the number of letter elements with the same char, and then pick just the first one, like this:

  <xsl:for-each select="catalog/letter[char!='']">
     <xsl:sort select="count(/catalog/letter[char=current()/char])" data-type="number" order="descending" />
     <xsl:if test="position() = 1">
        <xsl:value-of select="char"/>
     </xsl:if>
  </xsl:for-each>

This, however, is very inefficient, because you will be counting the same nodes over and over again.

A better technique is to use Muenchian Grouping, which involves defining a key to look up the elements:

<xsl:key name="letter" match="letter[char!='']" use="char" />

You then use this key to find each distinct letter, and also to count the number of letters with the same char.
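
To see both uses of the key on their own, here is a small sketch that lists every distinct letter with its count instead of picking a winner. The text output format ("A: 2", one per line) is just a choice made for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:key name="letter" match="letter[char!='']" use="char" />

   <xsl:template match="/">
      <!-- Keep only the first letter of each key group: one node per distinct char.
           Letters with an empty char are not in the key, so they never match. -->
      <xsl:for-each select="catalog/letter[generate-id() = generate-id(key('letter', char)[1])]">
         <xsl:value-of select="char" />
         <xsl:text>: </xsl:text>
         <!-- Size of the group this letter belongs to -->
         <xsl:value-of select="count(key('letter', char))" />
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>
```

For the sample XML in the question this prints "A: 2" and "B: 1", each on its own line.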

Try this XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>
   <xsl:key name="letter" match="letter[char!='']" use="char" />

   <xsl:template match="/">
      <xsl:for-each select="catalog/letter[generate-id() = generate-id(key('letter', char)[1])]">
         <xsl:sort select="count(key('letter', char))" data-type="number" order="descending" />
         <xsl:if test="position() = 1">
            <xsl:value-of select="char"/>
         </xsl:if>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>

This should output just A.
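
If you want the result inside the table from your question, the same loop can go between the row tags. This is only a sketch: it assumes HTML output is what you are after, and it sets data-type="number" on the sort so that counts are compared as numbers rather than as text (which matters once a count reaches 10).

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="html" indent="yes"/>
   <xsl:key name="letter" match="letter[char!='']" use="char" />

   <xsl:template match="/">
      <table><tr>
         <!-- One node per distinct non-empty char, most frequent first -->
         <xsl:for-each select="catalog/letter[generate-id() = generate-id(key('letter', char)[1])]">
            <xsl:sort select="count(key('letter', char))" data-type="number" order="descending" />
            <!-- Emit a cell only for the first node in sorted order -->
            <xsl:if test="position() = 1">
               <td><xsl:value-of select="char" /></td>
            </xsl:if>
         </xsl:for-each>
      </tr></table>
   </xsl:template>
</xsl:stylesheet>
```

If two letters are tied for the highest count, the one that appears first in the document is shown, because xsl:sort keeps document order for equal sort keys.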

Upvotes: 2
