Reputation: 5471
I am trying to sort XML files on four keys, in this order: node names, attribute names, attribute values, and finally node values.
My XML
<NodeRoot>
<NodeA class="3">
<NodeB>
<NodeC abc="1">103</NodeC>
<NodeD>103</NodeD>
<NodeC pqr="2">101</NodeC>
<NodeC pqr="1">102</NodeC>
<NodeD>101</NodeD>
</NodeB>
</NodeA>
<NodeA class="1">
<NodeGroup>
<NodeC name="z" asc="2">103</NodeC>
<NodeC name="b">101</NodeC>
<NodeC name="a">102</NodeC>
</NodeGroup>
</NodeA>
</NodeRoot>
My XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="utf-8" method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
<xsl:apply-templates select="node()">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Current Output
<NodeRoot>
<NodeA class="1">
<NodeGroup>
<NodeC name="b">101</NodeC>
<NodeC name="a">102</NodeC>
<NodeC asc="2" name="z">103</NodeC>
</NodeGroup>
</NodeA>
<NodeA class="3">
<NodeB>
<NodeC pqr="2">101</NodeC>
<NodeC pqr="1">102</NodeC>
<NodeC abc="1">103</NodeC>
<NodeD>101</NodeD>
<NodeD>103</NodeD>
</NodeB>
</NodeA>
</NodeRoot>
Expected Outcome
<NodeRoot>
<NodeA class="1">
<NodeGroup>
<NodeC asc="2" name="z">103</NodeC>
<NodeC name="a">102</NodeC>
<NodeC name="b">101</NodeC>
</NodeGroup>
</NodeA>
<NodeA class="3">
<NodeB>
<NodeC abc="1">103</NodeC>
<NodeC pqr="1">102</NodeC>
<NodeC pqr="2">101</NodeC>
<NodeD>101</NodeD>
<NodeD>103</NodeD>
</NodeB>
</NodeA>
</NodeRoot>
Test XSLT --> http://xsltransform.net/naZXpY7
Upvotes: 3
Views: 1829
Reputation: 5471
Based on @C. M. Sperberg-McQueen's suggestion of the node-set extension, and with the help of the example at https://www.xml.com/pub/a/2003/07/16/nodeset.html, I came up with a single stylesheet that merges his two pipeline stylesheets.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" exclude-result-prefixes="exslt" xmlns:PJ="http://example.com/PankajJaju">
<xsl:output encoding="utf-8" method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*" name="first-pass">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:if test="self::*">
<xsl:attribute name="PJ:attribute-names" namespace="http://example.com/PankajJaju">
<xsl:call-template name="attribute-name-list"/>
</xsl:attribute>
<xsl:attribute name="PJ:attribute-values" namespace="http://example.com/PankajJaju">
<xsl:call-template name="attribute-value-list"/>
</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template name="attribute-name-list">
<xsl:for-each select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(local-name(), ' ')"/>
</xsl:for-each>
</xsl:template>
<xsl:template name="attribute-value-list">
<xsl:for-each select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(string(), ' ')"/>
</xsl:for-each>
</xsl:template>
<xsl:template match="/">
<xsl:variable name="process-one">
<xsl:call-template name="first-pass"/>
</xsl:variable>
<xsl:apply-templates select="exslt:node-set($process-one)" mode="second-pass"/>
</xsl:template>
<xsl:template match="@*|node()" mode="second-pass">
<xsl:copy>
<xsl:apply-templates select="node()|@*" mode="second-pass"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*" mode="second-pass">
<xsl:copy>
<xsl:apply-templates select="@*" mode="second-pass">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
<xsl:apply-templates select="node()" mode="second-pass">
<xsl:sort select="local-name()"/>
<xsl:sort select="@PJ:attribute-names"/>
<xsl:sort select="@PJ:attribute-values"/>
<xsl:sort select="."/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="@PJ:attribute-names | @PJ:attribute-values" mode="second-pass"/>
</xsl:stylesheet>
Upvotes: 1
Reputation: 25034
You're currently sorting all attributes of an element by local name and value, and then all children (again by local name and string value).
So far, so good.
One difficulty you face is what exactly you mean by sorting by "attribute names". From your example, it looks as if you want elements sorted by a list of their attribute names in alphabetic order, so that the sort keys for the children of your NodeGroup element are
'NodeC', 'asc name', '2 z', 103
'NodeC', 'name', 'a', 102
'NodeC', 'name', 'b', 101
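To check sort keys like these outside XSLT, here is a minimal Python sketch that computes the four-part key for each child of the NodeGroup element. It is only an illustration, not part of any stylesheet: the helper name sort_key is hypothetical, the elements are assumed to have no namespaces, and Python compares strings by code point, which may differ from the collation an XSLT processor applies.

```python
import xml.etree.ElementTree as ET

# The NodeGroup fragment from the question.
XML = """<NodeGroup>
  <NodeC name="z" asc="2">103</NodeC>
  <NodeC name="b">101</NodeC>
  <NodeC name="a">102</NodeC>
</NodeGroup>"""


def sort_key(elem):
    """Return (node name, attribute names, attribute values, node value)."""
    # Order the attributes by name, then by value, before joining them.
    attrs = sorted(elem.attrib.items())
    names = " ".join(name for name, _ in attrs)
    values = " ".join(value for _, value in attrs)
    return (elem.tag, names, values, (elem.text or "").strip())


group = ET.fromstring(XML)
ordered = sorted(group, key=sort_key)
for child in ordered:
    print(sort_key(child))
```

Sorting on these tuples puts the element with the 'asc name' key first, then the 'name' elements in order of their attribute values, which is the order shown in the expected outcome.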
The next difficulty is that there's no obvious way to obtain the value 'asc name' from an XPath 1.0 expression with the first NodeC of your NodeGroup element as context node. It's possible to generate the string, of course, but it requires a call to a named template. (Or, to be more precise: I don't see how to generate it without such a call.)
XSLT 2.0 solution
The problem is relatively straightforward in XSLT 2.0; the following fragments show the crucial bits:
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
<xsl:apply-templates select="node()">
<xsl:sort select="local-name()"/>
<xsl:sort select="string-join(local:key2(.), ' ')"/>
<xsl:sort select="string-join(local:key3(.), ' ')"/>
<xsl:sort select="." data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:function name="local:key2" as="xs:string*">
<xsl:param name="e" as="node()"/>
<xsl:for-each select="$e/@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="local-name()"/>
</xsl:for-each>
</xsl:function>
<xsl:function name="local:key3" as="xs:string*">
<xsl:param name="e" as="node()"/>
<xsl:for-each select="$e/@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="string()"/>
</xsl:for-each>
</xsl:function>
This general approach can also be used in XSLT 1.0 with the EXSLT extension for user-defined functions.
Solution in XSLT 1.0 with EXSLT functions
If your XSLT 1.0 processor supports EXSLT-style user-defined functions, you may be able to do something similar in XSLT 1.0. (My initial attempts failed, but the errors disappeared when I remembered to add the extension-element-prefixes attribute to the stylesheet element.)
<xsl:stylesheet version="1.0"
extension-element-prefixes="func"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:func="http://exslt.org/functions"
xmlns:local="http://example.com/nss/dummy">
<xsl:output encoding="utf-8"
method="xml"
omit-xml-declaration="yes"
indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
<xsl:apply-templates select="node()">
<xsl:sort select="local-name()"/>
<xsl:sort select="local:key2(.)"/>
<xsl:sort select="local:key3(.)"/>
<xsl:sort select="." data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<func:function name="local:key2">
<xsl:param name="e" select="."/>
<func:result>
<xsl:for-each select="$e/@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(local-name(), ' ')"/>
</xsl:for-each>
</func:result>
</func:function>
<func:function name="local:key3">
<xsl:param name="e" select="."/>
<func:result>
<xsl:for-each select="$e/@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(string(), ' ')"/>
</xsl:for-each>
</func:result>
</func:function>
</xsl:stylesheet>
When run on your input with xsltproc, this produces the desired output.
You might also be able to do something clever in XSLT 1.0 with the node-set extension.
Two-stage pipeline in unextended XSLT 1.0
But the simplest way I can see to solve this problem in unextended XSLT 1.0 is to pipeline two stylesheets together. The first one adds two attributes to every element, to provide sort keys 2 and 3. (Adjust the named templates to make them do what you want.)
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:PJ="http://example.com/PankajJaju">
<xsl:output encoding="utf-8"
method="xml"
omit-xml-declaration="yes"
indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:if test="self::*">
<xsl:attribute name="PJ:attribute-names"
namespace="http://example.com/PankajJaju">
<xsl:call-template name="attribute-name-list"/>
</xsl:attribute>
<xsl:attribute name="PJ:attribute-values"
namespace="http://example.com/PankajJaju">
<xsl:call-template name="attribute-value-list"/>
</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template name="attribute-name-list">
<xsl:for-each select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(local-name(), ' ')"/>
</xsl:for-each>
</xsl:template>
<xsl:template name="attribute-value-list">
<xsl:for-each select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="string()"/>
<xsl:value-of select="concat(string(), ' ')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
I've put the two temporary attributes into a namespace to reduce the likelihood of name collisions with attributes already present in the input.
The second one uses the sort keys to perform the actual sort and suppresses the temporary attributes.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:PJ="http://example.com/PankajJaju">
<xsl:output encoding="utf-8"
method="xml"
omit-xml-declaration="yes"
indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*">
<xsl:sort select="local-name()"/>
<xsl:sort select="."/>
</xsl:apply-templates>
<xsl:apply-templates select="node()">
<xsl:sort select="local-name()"/>
<xsl:sort select="@PJ:attribute-names"/>
<xsl:sort select="@PJ:attribute-values"/>
<xsl:sort select="."/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="@PJ:attribute-names | @PJ:attribute-values"/>
</xsl:stylesheet>
These can be pipelined together using whatever technology you prefer. For example, using xsltproc from the bash command line, with pipeline stylesheets 1 and 2 saved as p1.xsl and p2.xsl:
xsltproc p1.xsl input.xml | xsltproc p2.xsl -
This produces the output you say you want.
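The same decorate, sort, and strip idea can be cross-checked outside XSLT. The following Python sketch mirrors the two stages with the standard library, under some assumptions: the temporary attribute names are hypothetical and not namespaced, strings are compared by code point rather than by the processor's collation, and attributes within an element are not reordered (the stylesheets do sort them).

```python
import xml.etree.ElementTree as ET

# Hypothetical names for the temporary sort-key attributes.
NAMES_KEY = "tmp-attribute-names"
VALUES_KEY = "tmp-attribute-values"


def decorate(elem):
    """Stage 1: store each element's sorted attribute names and values on it."""
    attrs = sorted(elem.attrib.items())  # by name, then by value
    elem.set(NAMES_KEY, " ".join(name for name, _ in attrs))
    elem.set(VALUES_KEY, " ".join(value for _, value in attrs))
    for child in elem:
        decorate(child)


def sort_and_strip(elem):
    """Stage 2: sort children on the four keys, then drop the temporary attributes."""
    elem[:] = sorted(
        elem,
        key=lambda c: (
            c.tag,
            c.get(NAMES_KEY),
            c.get(VALUES_KEY),
            "".join(c.itertext()).strip(),  # string value of the element
        ),
    )
    for child in elem:
        sort_and_strip(child)
    del elem.attrib[NAMES_KEY]
    del elem.attrib[VALUES_KEY]


def two_stage_sort(xml_text):
    root = ET.fromstring(xml_text)
    decorate(root)
    sort_and_strip(root)
    return ET.tostring(root, encoding="unicode")


# The NodeB element from the question, written without whitespace.
SAMPLE = (
    '<NodeB><NodeC abc="1">103</NodeC><NodeD>103</NodeD>'
    '<NodeC pqr="2">101</NodeC><NodeC pqr="1">102</NodeC>'
    "<NodeD>101</NodeD></NodeB>"
)
print(two_stage_sort(SAMPLE))
```

On the NodeB sample, the children come out in the same order as in the expected outcome from the question: the abc element first, then the two pqr elements by value, then the NodeD elements.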
Upvotes: 1