Weslor

Reputation: 22450

Sorting XML file by element

I want to sort an XML file so that elements on the same level are ordered alphabetically by name. That means sorting the elements on the first level, then the subelements inside every element, and so on recursively. It must be multilevel, not only one level (that case is solved in another question). For example (please ignore the content and the meaning):

<example>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
            <headb>head</head3>
            <heada>head</head3>
        <body>Don't forget me this weekend!</body>
    </note>
    <next>
        <c>blabla</c>
        <a>blabla</a>
    </next>
</example>

To:

<example>
    <next>
        <a>blabla</a>
        <c>blabla</c>
    </next>
    <note>
        <body>Don't forget me this weekend!</body>
        <from>Jani</from>
        <heading>Reminder</heading>
            <heada>head</head3>
            <headb>head</head3>
        <to>Tove</to>
    </note>
</example>

The XML can contain thousands of lines and many levels of elements.

Upvotes: 1

Views: 717

Answers (1)

Daniel Haley

Reputation: 52888

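You should be able to use an identity transform and add an xsl:sort that sorts on either name() or local-name(). Because the identity template is applied recursively to every node, the children get sorted at every level, not just the first.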

Example...

XML Input (Well formed and slightly more complicated than the original.)

<example>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <headb>head</headb>
        <heada>head</heada>
        <body>Don't forget me this weekend!</body>
    </note>
    <next>
        <c>blabla</c>
        <a>blabla</a>
    </next>
    <djh>
        <!-- comment -->
        <foo attr="test">
            <bar>
                <baz>text</baz>
                <foo><!--comment--></foo>
            </bar>
            <baz attr="test">
                <foo/>
                <bar/>
            </baz>
        </foo>
        <?PI?>
        <baz>
            <bar>
                <foo/>
                <baz/>
            </bar>
            <foo>
                <bar/>
                <baz/>
            </foo>
        </baz>
    </djh>
</example>

XSLT 1.0

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

XML Output

<example>
   <djh><!-- comment --><?PI?>
      <baz>
         <bar>
            <baz/>
            <foo/>
         </bar>
         <foo>
            <bar/>
            <baz/>
         </foo>
      </baz>
      <foo attr="test">
         <bar>
            <baz>text</baz>
            <foo><!--comment--></foo>
         </bar>
         <baz attr="test">
            <bar/>
            <foo/>
         </baz>
      </foo>
   </djh>
   <next>
      <a>blabla</a>
      <c>blabla</c>
   </next>
   <note>
      <body>Don't forget me this weekend!</body>
      <from>Jani</from>
      <heada>head</heada>
      <headb>head</headb>
      <heading>Reminder</heading>
      <to>Tove</to>
   </note>
</example>

Notice, though, that comments and processing instructions end up floating to the top of the sort order.
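
If you would rather not depend on how those nodes happen to sort, one option is to make the ordering explicit. The following is only a sketch and not something I have run against your data: it copies attributes first, then comments, processing instructions and text in document order, and applies the sort to child elements only. Copying attributes separately also sidesteps a corner case in the stylesheet above: an attribute whose name sorts after a child element's name would be copied after that child, which as far as I know is an error in XSLT 1.0 (a processor may report it or just drop the attribute).

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <!-- Attributes first, so none is copied after a child node. -->
            <xsl:apply-templates select="@*"/>
            <!-- Comments, processing instructions and text, in document order. -->
            <xsl:apply-templates select="node()[not(self::*)]"/>
            <!-- Child elements only, sorted by name. -->
            <xsl:apply-templates select="*">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

For the input above this should give the same element order as the first stylesheet. The only intended difference is that the position of the non-element nodes is now a deliberate choice, which you can change by moving the second xsl:apply-templates after the sorted one.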

Also note that if you have mixed content (element and text nodes in the same parent), you might want to skip the sorting of that element. Otherwise the text will come first in the sort order (like comments and processing instructions).

Here's a way to skip mixed content elements (comment and processing instruction output does change so you may want to experiment):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*[not(text())]">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

Upvotes: 2
