basZero

Reputation: 4284

How would you count different substrings in XSLT?

I have the following XML snippet:

   <wrapper>
      <item timestamp="19.10.2011 12:05">
         <comment>Used for orderID '011187' with item 'xyz1'</comment>
      </item>
      <item timestamp="01.06.2012 16:25">
         <comment>Used for orderID '011379' with item 'xyz2'</comment>
      </item>
      <item timestamp="06.06.2012 14:32">
         <comment>Used for orderID '011382' with item 'xyz2'</comment>
      </item>
   </wrapper>

I want to know how many times each item occurs. In this case:
- 1 x xyz1
- 2 x xyz2

So you would have to loop over all <item> elements, extract the string between the single quotes (') that follows the text with item, and then count how many times that string occurs in total inside the wrapper element.

How would you solve this in XSLT?

Upvotes: 2

Views: 446

Answers (3)

Daniel Haley

Reputation: 52878

Here's another XSLT 1.0 option using Muenchian grouping (xsl:key)...

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="items" match="comment" 
    use="substring-after(normalize-space(),'with item ')"/>

  <xsl:template match="/wrapper">
    <xsl:for-each select="item/comment[count(.|key('items',
      substring-after(normalize-space(),'with item '))[1])=1]">
      <xsl:variable name="items" select="key('items',
        substring-after(normalize-space(),'with item '))"/>
      <xsl:value-of select='concat("- ",count($items)," x ",
        translate(substring-after(normalize-space(),"with item "),"&apos;",""),
        "&#xA;")'/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

Example (since everyone else is doing it ;-): http://xsltransform.net/93dEHFs
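For readers new to the idiom: the key indexes every comment by its item string, and the test count(.|key(...)[1])=1 is true only when the current node is the first node of its group, so each group is reported once. Below is a minimal sketch of the same technique, assuming the item name always appears as item 'name'. It keys on the bare name between the quotes, so no translate() is needed afterwards. The key name by-item and the variables name and group are made up for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index each comment by the bare name between the quotes after "item" -->
  <xsl:key name="by-item" match="comment"
    use='substring-before(substring-after(., "item &apos;"), "&apos;")'/>

  <xsl:template match="/wrapper">
    <xsl:for-each select="item/comment">
      <xsl:variable name="name"
        select='substring-before(substring-after(., "item &apos;"), "&apos;")'/>
      <!-- All comments that share this item name -->
      <xsl:variable name="group" select="key('by-item', $name)"/>
      <!-- The union has one node only if the current comment is the first of its group -->
      <xsl:if test="count(. | $group[1]) = 1">
        <xsl:value-of select="concat('- ', count($group), ' x ', $name, '&#xA;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input from the question this should print one line per distinct item, in document order of first occurrence.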

Upvotes: 2

G_H

Reputation: 12009

For XSLT 2 you would do best to go with fafl's answer. If you're stuck with XSLT 1, the following would work:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text" encoding="UTF-8" indent="yes" />

    <xsl:template match="node()|@*">
      <xsl:apply-templates select="node()|@*" />
    </xsl:template>

    <xsl:template match="comment">
        <xsl:variable name="item" select='substring-before(substring-after(., "item &apos;"), "&apos;")' />
        <xsl:variable name="quotedItem" select='concat("item &apos;", $item, "&apos;")' />
        <xsl:if test='generate-id(.) = generate-id(//comment[contains(text(), $quotedItem)])'>
            <xsl:value-of select='count(//comment[contains(text(), $quotedItem)])' />
            <xsl:text> x </xsl:text>
            <xsl:value-of select="$item" />
            <xsl:text>&#13;&#10;</xsl:text>
        </xsl:if>
    </xsl:template>

</xsl:transform>

Let's break it down a bit. The first template quite simply applies templates recursively. The second template matches all <comment> elements.

First the item name is extracted by taking the substring after item ' and then taking the substring before ' in that result. This assumes that the item name always occurs in the form item 'name'. If not, you'll need to adjust this. The result is assigned to the variable item. Note that the single quotes make this a bit tricky, because both double and single quotes can delimit XML attribute values. So the select attribute value is put between single quotes instead of the usual double quotes, and the single quotes intended as literal text are written as &apos;.
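To make the two extraction steps concrete, here is a small sketch that prints the intermediate and the final value for every comment. The variable name tail is only introduced for this illustration and is not part of the stylesheet above.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text" />

    <xsl:template match="/">
        <xsl:for-each select="//comment">
            <!-- Step 1: keep everything after item ' (for example xyz1') -->
            <xsl:variable name="tail" select='substring-after(., "item &apos;")' />
            <!-- Step 2: keep everything before the closing quote (for example xyz1) -->
            <xsl:variable name="item" select='substring-before($tail, "&apos;")' />
            <!-- Print the intermediate value, an arrow, and the final item name -->
            <xsl:value-of select="concat($tail, ' -> ', $item, '&#10;')" />
        </xsl:for-each>
    </xsl:template>

</xsl:transform>
```

For the first comment in the question this should give xyz1' -> xyz1. If a comment does not contain item ', both substring calls return an empty string.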

Then a variable named quotedItem is assigned, which is basically the string item 'name' (with name the actual item value), to make things a bit easier later on. It avoids matching the item name outside of quotes, or partial matches (for example if one comment contained item 'xy' and another item 'xyz'). Again, this makes assumptions about the input.

Then the test in the if element checks whether the generated ID of the current <comment> equals the generated ID of the first <comment> that contains the quotedItem substring (generate-id() applied to a node-set returns the ID of its first node in document order), so that an action is taken only for the first occurrence of each item. The action, in this case, is to count all the <comment> elements which contain the quotedItem substring, and output that as count x item followed by a carriage return and newline.

The core parts are the variables and the generate-id trick. The rest will depend on what you mean to do with the results.
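A variant sketch of the same idea, under the same input assumptions: it stores the matching comments in a variable (called matches here, a name chosen just for this example) and writes the [1] explicitly, which makes it visible that the comparison is against the first match. It also uses contains(., ...) instead of contains(text(), ...), which is equivalent for comments that hold a single text node.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text" />

    <xsl:template match="/">
        <xsl:for-each select="//comment">
            <xsl:variable name="item" select='substring-before(substring-after(., "item &apos;"), "&apos;")' />
            <xsl:variable name="quotedItem" select='concat("item &apos;", $item, "&apos;")' />
            <!-- All comments mentioning the same quoted item, in document order -->
            <xsl:variable name="matches" select="//comment[contains(., $quotedItem)]" />
            <!-- True only when the current comment is the first of those matches -->
            <xsl:if test="generate-id(.) = generate-id($matches[1])">
                <xsl:value-of select="concat(count($matches), ' x ', $item, '&#10;')" />
            </xsl:if>
        </xsl:for-each>
    </xsl:template>

</xsl:transform>
```

Note that this still scans all comments once per comment, so for large inputs a key-based approach is likely to scale better.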

xsltransform link: http://xsltransform.net/a9Giwm

Upvotes: 2

fafl

Reputation: 7385

In XSLT 2.0 this works:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="wrapper">
      <xsl:for-each-group select="item/comment"
           group-by="replace(tokenize(., ' ')[last()], '[^a-zA-Z0-9]', '')">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="concat('- ', count(current-group()), ' x ', current-grouping-key(), '&#xa;')"/>
      </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>

Fiddle: http://xsltransform.net/pNmBxZD
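The grouping key above takes the last space-separated token of the comment and strips everything that is not a letter or digit, so it relies on the item name being the final word and purely alphanumeric. If that does not hold, a sketch like the following might be closer to what is needed. It assumes instead that the name always appears as item 'name' and groups on the text between those quotes.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="wrapper">
      <!-- Group on the text between the quotes that follow "item" -->
      <xsl:for-each-group select="item/comment"
           group-by='substring-before(substring-after(., "item &apos;"), "&apos;")'>
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="concat('- ', count(current-group()), ' x ', current-grouping-key(), '&#xa;')"/>
      </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>
```

Comments without item ' would all end up in a single group with an empty key, so you may want to filter them out in the select expression.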

Upvotes: 2
