Warjeh

Reputation: 1220

XSL Sort Numbers Separated by Periods

I have a problem sorting numbers separated by periods (e.g. 1, 2.1, 1.1, 1.3). I found a solution in the question "XSL recursive sort". It is close to what I need, but my case is slightly different. In my XML, the tags look like this:

 <root>
   <row>
       <col name="rank"/>
       <name>A</name>
       <val>1.1</val>
   </row>
   <row>
       <col name="rank"/>
       <name>B</name>
       <val>1</val>
   </row>
   <row>
       <col name="level"/>
       <name>C</name>
       <val>test</val>
   </row>
   <row>
       <col name="rank"/>
       <name>D</name>
       <val>1.2.2</val>
   </row>
   <row>
       <col name="rank"/>
       <name>E</name>
       <val>1.2.1</val>
    </row>
   <row>
       <col name="rank"/>
       <name>F</name>
       <val>1.2</val>
    </row>
 </root>

and I want to sort all rows whose col/@name = "rank" by their "val" elements. Is it possible to get the output by only modifying the accepted answer in the linked question? If not, is there a solution in XSLT 1.0 (or 2.0 if there is none)? The output I need looks like:

<ul>
   <li>1 - B
     <ul>
       <li>1.1 - A</li>
       <li>1.2 - F
          <ul>
             <li>1.2.1 - E</li>
             <li>1.2.2 - D</li>
          </ul>
       </li>
    </ul>      
   </li>
</ul>

Update I: So based on michael.hor257k's answer, this is the solution I was looking for:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" version="5.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/root">
    <html>
    <body>
      <ul>
        <xsl:apply-templates select="row[not(contains(val, '.'))][contains(col/@name, 'rank')]">
          <xsl:sort select="val" data-type="number" order="ascending"/>
        </xsl:apply-templates>
      </ul>
    </body>    
    </html>      
  </xsl:template>

  <xsl:template match="row">
    <li>
      <xsl:variable name="parent" select="concat(val, '.')" />
      <xsl:value-of select="./name"/> - <xsl:value-of select="./val"/>
      <xsl:if test="../row[starts-with(val, $parent)][not(contains(substring-after(val, $parent), '.'))][contains(col/@name, 'rank')]">
        <ul>
          <xsl:apply-templates select="../row[starts-with(val, $parent)][not(contains(substring-after(val, $parent), '.'))][contains(col/@name, 'rank')]">
            <xsl:sort select="substring-after(val, $parent)" data-type="number" order="ascending"/>
          </xsl:apply-templates>
        </ul>
      </xsl:if>
    </li>
  </xsl:template>
</xsl:stylesheet> 

Update II: Thanks to Dimitre Novatchev, I have a better solution, which I think is the best answer. I am trying to understand it; after that I'll mark it as the accepted answer.

Update III: I accepted the answer michael.hor257k posted because it was what I needed. There are no gaps in the rank order in my XML, but as Dimitre Novatchev mentioned, if there are gaps (for example, 1.3.2 with no 1.3), that solution will not work, and you can use the complete answer that Dimitre Novatchev posted.

Upvotes: 0

Views: 430

Answers (5)

Michael Kay

Reputation: 163549

And here's another candidate if you're able to take advantage of it: the new fn:sort function in XPath 3.1 allows use of a composite sort key so you can write

sort(val, (), function($x){tokenize($x, '\.')!number()})
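
To show the composite key in context, here is a minimal sketch of a complete stylesheet. It assumes an XSLT 3.0 processor that implements XPath 3.1 including higher-order functions. The `sorted` and `item` output elements are made up for illustration. In the final XPath 3.1 recommendation the key function is the third argument of fn:sort, and the second argument is the collation, for which the empty sequence selects the default.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <!-- "sorted" and "item" are hypothetical names used only in this sketch -->
    <sorted>
      <!-- fn:sort(input, collation, key): () means "use the default collation".
           The key is the sequence of numeric chunks, e.g. "1.2.1" becomes (1, 2, 1). -->
      <xsl:for-each select="sort(row[col/@name = 'rank'], (),
                                 function($r) { tokenize($r/val, '\.') ! number() })">
        <item val="{val}" name="{name}"/>
      </xsl:for-each>
    </sorted>
  </xsl:template>
</xsl:stylesheet>
```

Because the key sequences are compared item by item, 1.2 sorts before 1.10, and a shorter key that is a prefix of a longer one (1.2 versus 1.2.1) sorts first. This only produces a flat sorted list; the nesting still has to be built separately.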

Upvotes: 0

Michael Kay

Reputation: 163549

Check whether your chosen XSLT processor supports a numeric collation (a way of sorting strings in which consecutive sequences of digits are treated as numbers, so "Chapter 2" sorts before "Chapter 10"). For example with the UCA collation URIs defined in XSLT 3.0 and XPath 3.1 this would be

    <xsl:sort select="val" 
         collation="http://www.w3.org/2013/collation/UCA?numeric=yes"/>

Numeric collations have also been available in Saxon for some years in the form

collation="http://saxon.sf.net/collation?alphanumeric=yes"

Upvotes: 0

Dimitre Novatchev

Reputation: 243549

I have a problem with sorting numbers, separated by periods (e.g. 1, 2.1, 1.1, 1.3). I found a solution here XSL recursive sort.

Part I. The sorting

It is very easy to adapt the original solution to the new case. Unlike the accepted answer, this solution correctly sorts XML documents in which there is

<val>1.3.2</val>

but there is no

<val>1.3</val>

See Part II for transforming the sorted result into the wanted nested list structure.


<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

    <xsl:template match="root">
        <xsl:copy>
            <xsl:apply-templates select="row">
                <xsl:sort select="substring-before(concat(val, '.'), '.')" 
                     data-type="number"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="row">
        <xsl:param name="prefix" select="''"/>
        <xsl:choose>
            <!-- end of recursion, there isn't any more row with more chunks -->
            <xsl:when test="val = substring($prefix, 1, string-length($prefix)-1)">
                <xsl:copy-of select="."/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="chunk" select=
                 "substring-before(concat(substring-after(val, $prefix), '.'), '.')"/>
                <!-- this tests for grouping row with same prefix, to skip duplicates -->
                <xsl:if test=
                "not(preceding-sibling::row[starts-with(val, concat($prefix, $chunk))])">
                    <xsl:variable name="new-prefix" 
                                          select="concat($prefix, $chunk, '.')"/>
                    <xsl:apply-templates select=
              "../row[starts-with(val, $new-prefix) or val = concat($prefix, $chunk)]">
                        <xsl:sort select= 
              "substring-before(concat(substring-after(val, $new-prefix), '.'), '.')" 
                                  data-type="number"/>
                        <xsl:with-param name="prefix" select="$new-prefix"/>
                    </xsl:apply-templates>
                </xsl:if>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (note that there is a <val>1.3.2</val> but no <val>1.3</val>; for such input the accepted answer doesn't produce the correct result and actually drops the whole <row> that has the <val>1.3.2</val> child):

<root>
    <row>
        <col name="rank"/>
        <name>A</name>
        <val>1.1</val>
    </row>
    <row>
        <col name="rank"/>
        <name>B</name>
        <val>1</val>
    </row>
    <row>
        <col name="rank"/>
        <name>F</name>
        <val>1.10</val>
    </row>
    <row>
        <col name="level"/>
        <name>C</name>
        <val>2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>D</name>
        <val>1.2.2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>D</name>
        <val>1.3.2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>E</name>
        <val>1.2.1</val>
    </row>
    <row>
        <col name="rank"/>
        <name>F</name>
        <val>1.2</val>
    </row>
</root>

the wanted, correctly sorted result is produced:

<root>
   <row>
      <col name="rank"/>
      <name>B</name>
      <val>1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>A</name>
      <val>1.1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>F</name>
      <val>1.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>E</name>
      <val>1.2.1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>D</name>
      <val>1.2.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>D</name>
      <val>1.3.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>F</name>
      <val>1.10</val>
   </row>
   <row>
      <col name="level"/>
      <name>C</name>
      <val>2</val>
   </row>
</root>

Finally, one more refactoring: eliminating all XSLT conditional instructions (xsl:choose and xsl:if):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/*">
        <xsl:copy>
            <xsl:apply-templates select="row">
                <xsl:sort select="substring-before(concat(val, '.'), '.')" 
                          data-type="number"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="row">
        <xsl:param name="prefix" select="''"/>
        <xsl:variable name="vHasChildren" select=
                           "not(val = substring($prefix, 1, string-length($prefix)-1))"/>
        <xsl:copy-of select="self::node()[not($vHasChildren)]"/>

        <xsl:variable name="chunk" 
             select="substring-before(concat(substring-after(val, $prefix), '.'), '.')"/>
        <xsl:variable name="new-prefix" select="concat($prefix, $chunk, '.')"/>
        <xsl:apply-templates select= "self::node()
            [$vHasChildren 
           and not(preceding-sibling::row[starts-with(val, concat($prefix, $chunk))])
            ]
             /../row[starts-with(val, $new-prefix) or val = concat($prefix, $chunk)]">
            <xsl:with-param name="prefix" select="$new-prefix"/>
            <xsl:sort data-type="number" select=
                "substring-before(concat(substring-after(val, $new-prefix), '.'), '.')"/>
        </xsl:apply-templates>
    </xsl:template>
</xsl:stylesheet>

Part II: Transforming the sorted, flat result into a nested list structure

Here we start with the result of the transformation produced in Part I, and from it we produce the wanted nested list structure. The sorted flat result we have so far is:

<root>
   <row>
      <col name="rank"/>
      <name>B</name>
      <val>1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>A</name>
      <val>1.1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>F</name>
      <val>1.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>E</name>
      <val>1.2.1</val>
   </row>
   <row>
      <col name="rank"/>
      <name>D</name>
      <val>1.2.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>D</name>
      <val>1.3.2</val>
   </row>
   <row>
      <col name="rank"/>
      <name>F</name>
      <val>1.10</val>
   </row>
   <row>
      <col name="level"/>
      <name>C</name>
      <val>2</val>
   </row>
</root>

We use this transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <list>
      <xsl:apply-templates select=
      "row[not(substring-before(concat(val, '.'), '.') 
              = substring-before(concat(preceding-sibling::row[1]/val,'.'),'.'))]">
       <xsl:with-param name="pPrefix" select="''"/>
      </xsl:apply-templates>
    </list>
  </xsl:template>


  <xsl:template match="row">
    <xsl:param name="pPrefix"/>
    <item val="{val}">
      <xsl:variable name="vnewPrefix" select="concat($pPrefix, val, '.')"/>
      <xsl:variable name="vcurrentVal" select="val"/>
      <xsl:apply-templates select="following-sibling::row
                  [starts-with(val, concat($vcurrentVal,'.'))
                 and
                   (string-length(val) - string-length(translate(val,'.','')) 
                    = 1 + string-length($vcurrentVal) - string-length(translate($vcurrentVal,'.','')
                    )
                   or
                    not(starts-with(val, 
                                    concat($vnewPrefix, 
                                           substring-before(concat(substring-after(preceding-sibling::row[1]/val, $vnewPrefix),'.'),'.'),
                                                            '.')
                                    )
                        )
                   )
                  ]">
         <xsl:with-param name="pPrefix" select="$vnewPrefix"/>
      </xsl:apply-templates>
    </item>
  </xsl:template>
</xsl:stylesheet>

The result of applying this transformation to the above XML document is the wanted nested list structure:

<list>
   <item val="1">
      <item val="1.1"/>
      <item val="1.2">
         <item val="1.2.1"/>
         <item val="1.2.2"/>
      </item>
      <item val="1.3.2"/>
      <item val="1.10"/>
   </item>
   <item val="2"/>
</list>
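
The depth comparison in the predicate above relies on a common XPath 1.0 idiom for counting characters: removing every period with translate() and comparing string lengths gives the number of periods in the value. A minimal sketch that only prints this count for each row (the `depths` wrapper and the `dots` attribute are names invented for this illustration):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/*">
    <!-- "depths" and "dots" are hypothetical names used only in this sketch -->
    <depths>
      <xsl:for-each select="row">
        <!-- number of '.' characters =
             length of val minus length of val with all '.' removed -->
        <row val="{val}"
             dots="{string-length(val) - string-length(translate(val, '.', ''))}"/>
      </xsl:for-each>
    </depths>
  </xsl:template>
</xsl:stylesheet>
```

For a value such as 1.2.1 the lengths are 5 and 3, so the count is 2. A direct child of a row is then a row whose value starts with the parent's value plus a period and has exactly one more period than the parent.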

We can similarly produce the wanted HTML, using this transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <ul>
      <xsl:apply-templates select=
      "row[not(substring-before(concat(val, '.'), '.') 
              = substring-before(concat(preceding-sibling::row[1]/val,'.'),'.'))]">
       <xsl:with-param name="pPrefix" select="''"/>
      </xsl:apply-templates>
    </ul>
  </xsl:template>

  <xsl:template match="row">
    <xsl:param name="pPrefix"/>
    <li> <xsl:value-of select="concat(val, ' - ', name, '&#xA;')"/>
      <xsl:variable name="vnewPrefix" select="concat($pPrefix, val, '.')"/>
      <xsl:variable name="vcurrentVal" select="val"/>

      <xsl:variable name="vnextInChain" select=
      "following-sibling::row
                  [starts-with(val, concat($vcurrentVal,'.'))
                 and
                   (string-length(val) - string-length(translate(val,'.','')) 
                    = 1 + string-length($vcurrentVal) - string-length(translate($vcurrentVal,'.','')
                    )
                   or
                    not(starts-with(val, 
                                    concat($vnewPrefix, 
                                           substring-before(concat(substring-after(preceding-sibling::row[1]/val, $vnewPrefix),'.'),'.'),
                                                            '.')
                                    )
                        )
                   )
                  ]"/>

      <xsl:if test="$vnextInChain">
       <ul>
        <xsl:apply-templates select="$vnextInChain">
         <xsl:with-param name="pPrefix" select="$vnewPrefix"/>
        </xsl:apply-templates>
       </ul>
      </xsl:if>
    </li>
  </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the flat sorted result, the wanted HTML result is produced:

<ul>
    <li>1 - B
        <ul>
            <li>1.1 - A
            </li>
            <li>1.2 - F
                <ul>
                    <li>1.2.1 - E
                    </li>
                    <li>1.2.2 - D
                    </li>
                </ul></li>
            <li>1.3.2 - D
            </li>
            <li>1.10 - F
            </li>
        </ul></li>
    <li>2 - C
    </li>
</ul>

Upvotes: 0

Martin Honnen

Reputation: 167716

Is the number of levels known? If so, in XSLT 2.0 you could use

<xsl:apply-templates select="row[col/@name = 'rank']">
    <xsl:sort select="xs:integer(tokenize(val, '\.')[1])"/>
    <xsl:sort select="xs:integer(tokenize(val, '\.')[2])"/>
    <xsl:sort select="xs:integer(tokenize(val, '\.')[3])"/>
</xsl:apply-templates>

for three levels. In XSLT 3.0 you could even do it with the sort function for any number of levels: <xsl:apply-templates select="sort(row[col/@name = 'rank'], (), function($row) { tokenize($row/val, '\.')!xs:integer(.) })"> (the empty sequence is the collation argument). However, as you also want to nest the result, I think using a recursive function doing

<xsl:for-each-group select="$rows" group-by="xs:integer(tokenize(val, '\.')[1])"><xsl:sort select="current-grouping-key()"/>...</xsl:for-each-group>

is more suitable in XSLT 2.0 or 3.0 than pure sorting.

A complete stylesheet is

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf" version="2.0">

    <xsl:output method="html" indent="yes"/>

    <xsl:function name="mf:nest" as="element()*">
        <xsl:param name="rows" as="element(row)*"/>
        <xsl:sequence select="mf:nest($rows, 1)"/>
    </xsl:function>

    <xsl:function name="mf:nest" as="element()*">
        <xsl:param name="rows" as="element(row)*"/>
        <xsl:param name="level" as="xs:integer"/>
        <xsl:for-each-group select="$rows" group-by="xs:integer(tokenize(val, '\.')[$level])">
            <xsl:sort select="current-grouping-key()"/>
            <li>
                <xsl:variable name="item" select="current-group()[not(tokenize(val, '\.')[$level + 1])]"/>
                <xsl:value-of select="$item/concat(name, ' - ', val)"/>
                <xsl:if test="current-group()[2]">
                    <ul>
                        <xsl:sequence select="mf:nest(current-group() except $item, $level + 1)"/>
                    </ul>
                </xsl:if>
            </li>
        </xsl:for-each-group>
    </xsl:function>

    <xsl:template match="root">
        <ul>
            <xsl:sequence select="mf:nest(row[col/@name = 'rank'])"/>
        </ul>
    </xsl:template>

</xsl:stylesheet>

it transforms the input

<root>
    <row>
        <col name="rank"/>
        <name>A</name>
        <val>1.1</val>
    </row>
    <row>
        <col name="rank"/>
        <name>B</name>
        <val>1</val>
    </row>
    <row>
        <col name="level"/>
        <name>C</name>
        <val>test</val>
    </row>
    <row>
        <col name="rank"/>
        <name>D</name>
        <val>1.2.2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>E</name>
        <val>1.2.1</val>
    </row>
    <row>
        <col name="rank"/>
        <name>foo</name>
        <val>2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>bar</name>
        <val>1.10</val>
    </row>
    <row>
        <col name="rank"/>
        <name>F</name>
        <val>1.2</val>
    </row>
    <row>
        <col name="rank"/>
        <name>F</name>
        <val>1.10.1</val>
    </row>
</root>

into the result

<ul>
   <li>B - 1
      <ul>
         <li>A - 1.1</li>
         <li>F - 1.2
            <ul>
               <li>E - 1.2.1</li>
               <li>D - 1.2.2</li>
            </ul>
         </li>
         <li>bar - 1.10
            <ul>
               <li>F - 1.10.1</li>
            </ul>
         </li>
      </ul>
   </li>
   <li>foo - 2</li>
</ul>

Upvotes: 2

michael.hor257k

Reputation: 117102

Consider the following stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/root">
    <list>
        <xsl:apply-templates select="row[not(contains(val, '.'))]">
            <xsl:sort select="val" data-type="number" order="ascending"/>
        </xsl:apply-templates>
    </list>
</xsl:template>

<xsl:template match="row">
    <xsl:variable name="parent" select="concat(val, '.')" />
    <item val="{val}">
        <xsl:apply-templates select="../row[starts-with(val, $parent)][not(contains(substring-after(val, $parent), '.'))]">
            <xsl:sort select="substring-after(val, $parent)" data-type="number" order="ascending"/>
        </xsl:apply-templates>
    </item>
</xsl:template>

</xsl:stylesheet> 

Applied to the following input example:

XML

<root>
   <row>
       <col name="rank"/>
       <name>A</name>
       <val>1.1</val>
   </row>
   <row>
       <col name="rank"/>
       <name>B</name>
       <val>1</val>
   </row>
   <row>
       <col name="rank"/>
       <name>F</name>
       <val>1.10</val>
    </row>
   <row>
       <col name="level"/>
       <name>C</name>
       <val>2</val>
   </row>
   <row>
       <col name="rank"/>
       <name>D</name>
       <val>1.2.2</val>
   </row>
   <row>
       <col name="rank"/>
       <name>E</name>
       <val>1.2.1</val>
    </row>
   <row>
       <col name="rank"/>
       <name>F</name>
       <val>1.2</val>
    </row>
 </root>

the result will be:

<?xml version="1.0" encoding="UTF-8"?>
<list>
   <item val="1">
      <item val="1.1"/>
      <item val="1.2">
         <item val="1.2.1"/>
         <item val="1.2.2"/>
      </item>
      <item val="1.10"/>
   </item>
   <item val="2"/>
</list>

This works recursively and there is no limit on the number of levels. Note however, that each item (other than "ancestor" items that do not contain a dot) must have a parent.
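
If you are not sure that your input meets this condition, you can check it first. The following is a minimal sketch, not part of the solution above: it lists every row whose value contains a period but whose immediate parent value is not present in the document. The `orphans` and `orphan` element names are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <!-- "orphans" and "orphan" are hypothetical names used only in this sketch -->
    <orphans>
      <xsl:for-each select="row[contains(val, '.')]">
        <xsl:variable name="v" select="val"/>
        <!-- A parent is a row whose val, followed by '.', is a prefix of $v
             and leaves exactly one chunk (no further '.') after that prefix. -->
        <xsl:if test="not(../row[starts-with($v, concat(val, '.'))
                                 and not(contains(substring-after($v, concat(val, '.')), '.'))])">
          <orphan val="{$v}"/>
        </xsl:if>
      </xsl:for-each>
    </orphans>
  </xsl:template>
</xsl:stylesheet>
```

For example, a row with 1.3.2 is reported when no row with 1.3 exists, because the only matching prefix (1.) leaves 3.2, which still contains a period. If the result is empty, every dotted value has its parent and the recursive stylesheet above will reach every row.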

Upvotes: 2
