NovumCoder

Reputation: 4657

XSLT: add up number and print subtotal multiple times

I'm a beginner in XSLT and have figured out that I cannot simply accumulate numbers in a variable, because a variable's value cannot be changed once it is set.

I have an XML document with a list of numbers that I need to add up until an element with a specific attribute value appears. At that point I need to print the subtotal, reset it to 0, and continue adding up the rest until I see that attribute value again.

For example, I have this XML:

<list>
 <entry>
  <field type="num" value="189.5" />
 </entry>
 <entry>
  <field type="num" value="1.5" />
 </entry>
 <entry>
  <field type="summary" />
 </entry>
 <entry>
  <field type="num" value="9.5" />
 </entry>
 <entry>
  <field type="num" value="11" />
 </entry>
 <entry>
  <field type="num" value="10" />
 </entry>
 <entry>
  <field type="summary" />
 </entry>
</list>

Now I want my XSLT to print this:

189.5
1.5
#191#
9.5
11
10
#30.5#

I have read that I can do this by using sum() with conditions. I know how to use for-each and address elements relatively, and I am also able to use sum() to total all fields having type=num. But how do I sum only the num values up to the first type=summary, and then only those between the previous type=summary and the next one?

I would expect something like this:

<xsl:for-each select="list/entry">
 <xsl:if test="field[@type='summary']">
  <!-- we are now at a type=summary element, now sum up -->
  #<xsl:value-of select="sum(WHAT_TO_PUT_HERE?)" />#
 </xsl:if>
 <xsl:if test="field[@type='num']">
  <xsl:value-of select="field/@value" />
 </xsl:if>
</xsl:for-each>

Appreciate any help.

Upvotes: 1

Views: 1585

Answers (4)

Dimitre Novatchev

Reputation: 243529

I. Here is a simple, forward-only solution. Note that no reverse axis is used, the time complexity is just O(N), and the extra space is just O(1), provided the XSLT processor optimizes tail recursion (see the update below).

This is probably the simplest and fastest of all presented solutions:

No monstrous complexity or grouping is required at all ...

No variables, no keys (and no space taken for caching key->values), no sum() ...

<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

  <xsl:template match="/*"><xsl:apply-templates select="*[1]"/></xsl:template>

  <xsl:template match="entry[field/@type = 'num']">
    <xsl:param name="pAccum" select="0"/>
    <xsl:value-of select="concat(field/@value, '&#xA;')"/>
    <xsl:apply-templates select="following-sibling::entry[1]">
      <xsl:with-param name="pAccum" select="$pAccum+field/@value"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="entry[field/@type = 'summary']">
    <xsl:param name="pAccum" select="0"/>
    <xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>  
    <xsl:apply-templates select="following-sibling::entry[1]"/>
  </xsl:template>
</xsl:stylesheet>

This is an example of a streaming-style transformation: in principle it doesn't require the complete XML document tree to be present in memory and can be used to process documents of indefinite length, although whether that is realized in practice depends on the XSLT processor.

When the transformation is applied on the provided source XML document:

<list>
    <entry>
        <field type="num" value="189.5" />
    </entry>
    <entry>
        <field type="num" value="1.5" />
    </entry>
    <entry>
        <field type="summary" />
    </entry>
    <entry>
        <field type="num" value="9.5" />
    </entry>
    <entry>
        <field type="num" value="11" />
    </entry>
    <entry>
        <field type="num" value="10" />
    </entry>
    <entry>
        <field type="summary" />
    </entry>
</list>

the wanted, correct result is produced:

189.5
1.5
#191#
9.5
11
10
#30.5#

II. Update

When run on sufficiently big XML documents with XSLT processors that don't optimize tail recursion, the transformation above causes a stack overflow, due to the long chain of nested <xsl:apply-templates> calls.

Below is another transformation, which doesn't cause stack overflow even with extremely big XML documents. Again, no reverse axes, no keys, no "grouping", no conditional instructions, no count(), no <xsl:variable> ...

And, most importantly, compared with the "efficient", key-based Muenchian grouping, this transformation takes only 61% of the time of the latter, when run on an XML document having 105 000 (105 thousand) lines:

<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <xsl:apply-templates select=
  "*[1] | entry[field/@type = 'summary']/following-sibling::*[1]"/>
 </xsl:template>

  <xsl:template match="entry[field/@type = 'num']">
    <xsl:param name="pAccum" select="0"/>

    <xsl:value-of select="concat(field/@value, '&#xA;')"/>

    <xsl:apply-templates select="following-sibling::entry[1]">
        <xsl:with-param name="pAccum" select="$pAccum+field/@value"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="entry[field/@type = 'summary']">
    <xsl:param name="pAccum" select="0"/>

    <xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>
 </xsl:template>
</xsl:stylesheet>

Additionally, this transformation can be sped up to take less than 50% of the time taken by the Muenchian grouping transformation (that is, to run more than twice as fast) by replacing every element name with just *.
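A minimal sketch of what that replacement might look like, assuming it means substituting * for both entry and field in the match patterns and select expressions. This variant relies on the input having exactly the structure shown in the question (every child of the top element is an entry with a single field child), and its speed has not been measured here:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Start one chain at the first child and one after every summary element -->
 <xsl:template match="/*">
  <xsl:apply-templates select=
  "*[1] | *[*/@type = 'summary']/following-sibling::*[1]"/>
 </xsl:template>

 <!-- A num element: print its value and pass the running total forward -->
 <xsl:template match="*[*/@type = 'num']">
   <xsl:param name="pAccum" select="0"/>

   <xsl:value-of select="concat(*/@value, '&#xA;')"/>

   <xsl:apply-templates select="following-sibling::*[1]">
     <xsl:with-param name="pAccum" select="$pAccum + */@value"/>
   </xsl:apply-templates>
 </xsl:template>

 <!-- A summary element: print the accumulated subtotal and end the chain -->
 <xsl:template match="*[*/@type = 'summary']">
   <xsl:param name="pAccum" select="0"/>

   <xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>
 </xsl:template>
</xsl:stylesheet>
```

The wildcard patterns no longer check element names, so any document whose elements are named differently but carry the same type and value attributes would be processed the same way.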

A lesson for us all to learn: A non-key solution sometimes can be more efficient than a key-based one.

Upvotes: 6

matthias_h

Reputation: 11416

As an alternative to the grouping suggested in the comments, you could also use match patterns to get the sums:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
  <xsl:strip-space elements="*"/>

  <xsl:template match="field[@type='num']">
    <xsl:value-of select="@value"/>
    <xsl:text>&#x0A;</xsl:text>
  </xsl:template>

  <xsl:template match="entry[field[@type='summary']]">
    <xsl:variable name="sumCount" select="count(preceding-sibling::entry[field[@type='summary']])"/>
    <xsl:text>#</xsl:text>
    <xsl:value-of select="sum(preceding-sibling::entry[count(preceding-sibling::entry[field[@type='summary']]) = $sumCount]/field[@type='num']/@value)"/>
    <xsl:text>#&#x0A;</xsl:text>
  </xsl:template>
</xsl:transform>

When applied to your input XML this produces the output

189.5
1.5
#191#
9.5
11
10
#30.5#

The template matching field[@type='num'] prints the value and adds a newline, and the template matching entry[field[@type='summary']] uses the variable

<xsl:variable name="sumCount" select="count(preceding-sibling::entry[field[@type='summary']])"/>

to check how many previous fields of the type summary occurred. Then only the sum of the values of all num entries with the same number of preceding summary fields is printed:

<xsl:value-of select="sum(preceding-sibling::entry[
                      count(preceding-sibling::entry[field[@type='summary']]) = $sumCount
                      ]/field[@type='num']/@value)"/>

Update: To explain in more detail how this works, as requested: in the template matching entry[field[@type='summary']], the variable sumCount counts all previous entries that have a field of type summary:

count(preceding-sibling::entry[field[@type='summary']])

So when the template matches the first summary field, the value of sumCount is 0, and when it matches the second summary field, sumCount is 1.
The second line, using the sum function

sum(
    preceding-sibling::entry
     [
      count(preceding-sibling::entry[field[@type='summary']]) = 
      $sumCount
     ]
     /field[@type='num']/@value
   )

sums all field[@type='num']/@value for all preceding entries that have the same number of preceding summary fields as the current summary entry:

count(preceding-sibling::entry[field[@type='summary']]) = $sumCount

So when the second summary is matched, only the num fields with the values 9.5, 11 and 10 are summed, as they have the same number of preceding summary fields as the current summary field.
For the num fields with the values 189.5 and 1.5,

count(preceding-sibling::entry[field[@type='summary']]) 

is 0, which differs from the sumCount of 1, so these fields are omitted from the sum.

Upvotes: 2

leu

Reputation: 2081

Too late to the party, and almost the same as what matthias_h did:

<?xml version="1.0" encoding="utf-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="//field[@type='num']">
    <xsl:value-of select="concat(@value,'&#x0a;')"/>
  </xsl:template>

  <xsl:template match="//field[@type='summary']">
    <xsl:variable name="prevSumCnt" select="count(preceding::field[@type='summary'])"/>
    <xsl:variable name="sum" select="sum(preceding::field[count(preceding::field[@type='summary'])=$prevSumCnt]/@value)"/>
    <xsl:value-of select="concat('#',$sum,'#&#x0a;')"/>
  </xsl:template>

  <xsl:template match="text()"/>
</xsl:transform>

The idea is to sum all fields that have the same number of summary fields before them as the current summary field.

Upvotes: 0

michael.hor257k

Reputation: 117073

You need a variation on Muenchian grouping. Start by defining a key as:

<xsl:key name="numbers" match="entry[field/@type='num']" use="generate-id(following-sibling::entry[field/@type='summary'][1])" />

then use:

#<xsl:value-of select="sum(key('numbers', generate-id())/field/@value)" />#

to sum the numbers in the current group.
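
Put together, a complete stylesheet might look like the sketch below. Only the key definition and the sum() expression come from the snippets above. The for-each/choose wrapper follows the outline in the question, and the text output method and explicit newlines are assumptions made to get one value per line:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every num entry by the id of the first summary entry that follows it -->
  <xsl:key name="numbers" match="entry[field/@type='num']"
           use="generate-id(following-sibling::entry[field/@type='summary'][1])"/>

  <xsl:template match="/list">
    <xsl:for-each select="entry">
      <xsl:choose>
        <!-- Summary entry: look up the num entries keyed to this entry's id and total them -->
        <xsl:when test="field/@type='summary'">
          <xsl:text>#</xsl:text>
          <xsl:value-of select="sum(key('numbers', generate-id())/field/@value)"/>
          <xsl:text>#&#xA;</xsl:text>
        </xsl:when>
        <!-- Num entry: print the value on its own line -->
        <xsl:otherwise>
          <xsl:value-of select="field/@value"/>
          <xsl:text>&#xA;</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Inside the for-each the context node is the entry itself, so generate-id() with no argument returns the id of the current summary entry, which is exactly the value the key stored for the num entries in its group.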

Upvotes: 1
