Reputation: 89
I've been working on a Geo application. Over time the product's XML has grown a bit messy. The problem arises when synchronizing changes across multiple environments, like Dev, Test, etc. I'm trying to figure out a way to normalize the content, so I can avoid cumbersome editing and merging, and hence have a more productive development. I know it sounds crazy, and there's a lot of background, but let me skip the history and jump to the actual issue.
Here's the issue:
Multiple sorting orders applied, like:
Treat d.c.b.a as a.b.c.d, or map.google.com as com.google.map, for sorting.
Then the <tgt> element, if present.
Remove the <scheme> and <port> tags when the values are generic, like http / https for the scheme tag and 80 or 443 for the port tag; otherwise retain them. Also remove them if there's no value, like <scheme/>.
Here's a bit of the problematic XML:
XML
<?xml version='1.0' encoding='UTF-8' ?>
<?tapia chrome-version='2.0' ?>
<mapGeo>
<a>blah</a>
<b>blah</b>
<maps>
<mapIndividual>
<src>
<scheme>https</scheme>
<domain>photos.yahoo.com</domain>
<path>somepath</path>
<query>blah</query>
</src>
<loc>C:\var\tmp</loc>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>tcp</scheme>
<domain>map.google.com</domain>
<port>80</port>
<path>/value</path>
<query>blah</query>
</src>
<tgt>
<scheme>https</scheme>
<domain>map.google.com</domain>
<port>443</port>
<path>/value</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>http</scheme>
<domain>*.c.b.a</domain>
<path>somepath</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>somepath</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>http</scheme>
<domain>d.c.b.a</domain>
<path>somepath</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>somepath</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
</maps>
</mapGeo>
I was able to apply basic sorting on the values as they are, but couldn't figure out a way to generate the reversed domain name. I came across XSL extensions, but haven't tried them yet. Here's the beginning of the solution I was working on, which is very basic.
XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="maps">
<xsl:copy>
<xsl:apply-templates select="*">
<xsl:sort select="src/domain" />
<xsl:sort select="src/port" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Expected Output
<?xml version='1.0' encoding='UTF-8' ?>
<?tapia chrome-version='2.0' ?>
<mapGeo>
<a>blah</a>
<b>blah</b>
<maps>
<mapIndividual>
<src>
<domain>d.c.b.a</domain>
<path>somepath</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>somepath</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<domain>*.c.b.a</domain>
<path>path1</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>path2</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>tcp</scheme>
<domain>map.google.com</domain>
<path>/value</path>
<query>blah</query>
</src>
<tgt>
<domain>map.google.com</domain>
<path>/value</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<domain>photos.yahoo.com</domain>
<path>somepath</path>
<query>blah</query>
</src>
<loc>C:\var\tmp</loc>
<x>blah</x>
<y>blah</y>
</mapIndividual>
</maps>
</mapGeo>
Note: I'd prefer XSLT 1.0 as that's what the current environment supports. XSLT 2.0 would be a plus.
Update: I figured out a way to support XSLT 2.0 and XSLT 3.0, so please ignore my previous note about XSLT 1.0.
Thank you in advance!
Cheers,
Upvotes: 0
Views: 128
Reputation: 1882
This XSLT 1.0 stylesheet (without extensions)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="maps">
<xsl:copy>
<xsl:apply-templates select="*">
<xsl:sort
select="translate(src/domain,translate(src/domain,'.',''),'')"
order="descending"/>
<xsl:sort
select="
substring-after(
substring-after(
substring-after(translate(src/domain,'*','~'),'.'),'.'),'.')"/>
<xsl:sort
select="
substring-after(
substring-after(translate(src/domain,'*','~'),'.'),'.')"/>
<xsl:sort
select="substring-after(translate(src/domain,'*','~'),'.')"/>
<xsl:sort select="translate(src/domain,'*','~')" />
<xsl:sort select="src/port" />
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Output
<?xml version="1.0" encoding="UTF-8"?>
<?tapia chrome-version='2.0' ?>
<mapGeo>
<a>blah</a>
<b>blah</b>
<maps>
<mapIndividual>
<src>
<scheme>http</scheme>
<domain>d.c.b.a</domain>
<path>somepath</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>somepath</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>http</scheme>
<domain>*.c.b.a</domain>
<path>somepath</path>
<port>8085</port>
<query>blah</query>
</src>
<tgt>
<domain>r.q.p</domain>
<path>somepath</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>tcp</scheme>
<domain>map.google.com</domain>
<port>80</port>
<path>/value</path>
<query>blah</query>
</src>
<tgt>
<scheme>https</scheme>
<domain>map.google.com</domain>
<port>443</port>
<path>/value</path>
<query>blah</query>
</tgt>
<x>blah</x>
<y>blah</y>
</mapIndividual>
<mapIndividual>
<src>
<scheme>https</scheme>
<domain>photos.yahoo.com</domain>
<path>somepath</path>
<query>blah</query>
</src>
<loc>C:\var\tmp</loc>
<x>blah</x>
<y>blah</y>
</mapIndividual>
</maps>
</mapGeo>
Do note: this relies on the fact that . (dot) precedes and ~ (tilde) follows letters in alphabetical order (at least for US). It also might not scale well: every additional domain label needs another xsl:sort key.
I agree with Martin Honnen's comment: this would be better solved in XSLT 2.0.
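For comparison, here is a minimal sketch of what the XSLT 2.0 route might look like, assuming an XSLT 2.0 processor is available (the question's update suggests it is). It builds the sort key by splitting src/domain on dots, reversing the tokens and joining them again, so map.google.com is compared as com.google.map. The '*' to '~' substitution is the same trick as above and carries the same caveat: whether '~' sorts after letters depends on the processor's default collation, so treat the wildcard ordering as an assumption to verify.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="maps">
    <xsl:copy>
      <xsl:apply-templates select="*">
        <!-- reversed domain as sort key: map.google.com is compared as com.google.map;
             '*' is replaced by '~' so wildcards are meant to sort after letters
             (assumption: depends on the default collation) -->
        <xsl:sort select="string-join(
                            reverse(tokenize(translate(src/domain, '*', '~'), '\.')),
                            '.')"/>
        <xsl:sort select="src/port"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the chain of substring-after() keys, this does not depend on how many labels a domain has. It does assume each mapIndividual has at most one src/domain.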
Upvotes: 0
Reputation: 116959
I don't think it's possible to sort in the reverse order you seek in a single pass using XSLT 1.0. Consider the following simplified example:
XML
<root>
<item>
<domain>t.q.p</domain>
</item>
<item>
<domain>s.q.p</domain>
</item>
<item>
<domain>photos.yahoo.com</domain>
</item>
<item>
<domain>map.google.com</domain>
</item>
<item>
<domain>aap.google.com</domain>
</item>
<item>
<domain>r.q.p</domain>
</item>
<item>
<domain>*.c.b.a</domain>
</item>
<item>
<domain>d.c.b.a</domain>
</item>
</root>
XSLT 1.0 (+ EXSLT node-set)
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/root">
<!-- 1st pass -->
<xsl:variable name="items">
<xsl:for-each select="item">
<xsl:copy>
<xsl:attribute name="sort-string">
<xsl:call-template name="reverse-tokens">
<xsl:with-param name="text" select="domain"/>
</xsl:call-template>
</xsl:attribute>
<xsl:copy-of select="@*|node()"/>
</xsl:copy>
</xsl:for-each>
</xsl:variable>
<!-- output -->
<xsl:copy>
<xsl:apply-templates select="exsl:node-set($items)/item">
<xsl:sort select="@sort-string" data-type="text" order="ascending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="@sort-string"/>
<xsl:template name="reverse-tokens">
<xsl:param name="text"/>
<xsl:param name="delimiter" select="'.'"/>
<xsl:variable name="token" select="substring-before(concat($text, $delimiter), $delimiter)"/>
<xsl:if test="contains($text, $delimiter)">
<!-- recursive call -->
<xsl:call-template name="reverse-tokens">
<xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
</xsl:call-template>
<xsl:value-of select="$delimiter"/>
</xsl:if>
<xsl:choose>
<xsl:when test="$token = '*'">
<xsl:text>zzzz</xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$token"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
<domain>d.c.b.a</domain>
</item>
<item>
<domain>*.c.b.a</domain>
</item>
<item>
<domain>aap.google.com</domain>
</item>
<item>
<domain>map.google.com</domain>
</item>
<item>
<domain>photos.yahoo.com</domain>
</item>
<item>
<domain>r.q.p</domain>
</item>
<item>
<domain>s.q.p</domain>
</item>
<item>
<domain>t.q.p</domain>
</item>
</root>
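The question also asks for generic or empty <scheme> and <port> elements to be dropped, which the example above does not cover. A minimal sketch in plain XSLT 1.0, assuming "generic" means exactly the values listed in the question (http / https, 80 / 443): two empty templates layered on the identity transform. Their predicates give them a higher default priority than the identity template, so matching elements are simply not copied.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- drop <scheme> when it is http, https, or has no value -->
  <xsl:template match="scheme[. = 'http' or . = 'https' or normalize-space(.) = '']"/>

  <!-- drop <port> when it is 80, 443, or has no value -->
  <xsl:template match="port[. = '80' or . = '443' or normalize-space(.) = '']"/>
</xsl:stylesheet>
```

These templates apply wherever the elements occur, under src as well as tgt. In the multi-pass stylesheet above, the first pass uses xsl:copy-of, so the children would need to go through xsl:apply-templates instead for the two templates to take effect.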
Upvotes: 0