Reputation: 327
Sample XML is below:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<check>
  <val>
    <Samsung>
      <name value="galaxy"/>
      <name value="galaxy"/>
      <name value="galaxys"/>
      <id value="123"/>
      <id value="123"/>
      <name2 value="galaxy"/>
    </Samsung>
    <htc>
      <name value="galaxy"/>
      <name value="galaxy"/>
      <name value="galaxys"/>
      <id value="123"/>
      <id value="123"/>
      <name2 value="galaxy"/>
    </htc>
  </val>
</check>
How can I remove the duplicate <name> and <id> tags with matching values? Also, if there are more tags other than <Samsung> and <htc>, how do I write a loop in XSLT? I have no idea how to write XSLT. Please help.
The output xml should look like:
<check>
  <val>
    <Samsung>
      <name value="galaxy"/>
      <name value="galaxys"/>
      <id value="123"/>
      <name2 value="galaxy"/>
    </Samsung>
    <htc>
      <name value="galaxy"/>
      <name value="galaxys"/>
      <id value="123"/>
      <name2 value="galaxy"/>
    </htc>
  </val>
</check>
Upvotes: 0
Views: 4579
Reputation: 70638
If you can ensure the duplicate nodes are always consecutive, then the simplest way to do this is to build upon the XSLT identity transform and just add an extra template to strip out the duplicates, like so:
<xsl:template
    match="*[not(*)]
           [name() = name(preceding-sibling::*[1])]
           [@value = preceding-sibling::*[1]/@value]" />
This matches any element that has no child elements and drops it if it has the same name and the same value attribute as the immediately preceding sibling element. There is no need to hard-code an element name anywhere in this case.
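As a rough trace of that rule on the <Samsung> block from the question, each element is compared only with the element directly before it in the source document. The comments below are annotations added for illustration; they are not part of the input.

```xml
<Samsung>
  <name value="galaxy"/>   <!-- kept: no preceding sibling -->
  <name value="galaxy"/>   <!-- dropped: same name and value as the previous element -->
  <name value="galaxys"/>  <!-- kept: same name, different value -->
  <id value="123"/>        <!-- kept: previous element has a different name -->
  <id value="123"/>        <!-- dropped: same name and value as the previous element -->
  <name2 value="galaxy"/>  <!-- kept: previous element has a different name -->
</Samsung>
```

The comparison runs against the input tree, so an element that was itself dropped still counts as the "previous" element for the one after it.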
Here is the full XSLT in this case
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />
  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Drop a leaf element that repeats the name and value of the element just before it -->
  <xsl:template match="*[not(*)][name() = name(preceding-sibling::*[1])][@value = preceding-sibling::*[1]/@value]" />
</xsl:stylesheet>
However, this would fail if your XML looked like this, where the duplicate nodes are not consecutive:
<Samsung>
  <name value="galaxy"/>
  <name value="galaxys"/>
  <id value="123"/>
  <name value="galaxy"/>
  <id value="123"/>
  <name2 value="galaxy"/>
</Samsung>
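You could fix this by changing the template to check back through all preceding siblings. The name and value must match on the same preceding element, which needs current(), and XSLT 1.0 does not allow current() in a match pattern, so the test moves inside the template:
<xsl:template match="*[not(*)]">
  <xsl:if test="not(preceding-sibling::*[name() = name(current())][@value = current()/@value])">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:if>
</xsl:template>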
However, this becomes inefficient with large numbers of elements. If you have hundreds of elements, each preceding-sibling check has to scan all the earlier siblings (i.e. the 100th element has to check 99 preceding ones, the 101st checks 100, and so on), so the total work grows roughly with the square of the number of siblings.
A more efficient method (in XSLT 1.0) is a technique called Muenchian grouping. It is certainly worth learning about if you use XSLT a lot.
First you define a key to 'group' your elements. In this case, you are looking for distinct elements defined by their parent, element name, and value:
<xsl:key name="duplicate" match="*[not(*)]" use="concat(generate-id(..), '|', name(), '|', @value)" />
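To see what this key groups together, a small diagnostic stylesheet along these lines can print the size of each element's group. This is only a sketch for exploration: the text output format is an arbitrary choice made here, and it reuses the key definition above unchanged.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />
  <!-- Same key as above: parent identity + element name + value attribute -->
  <xsl:key name="duplicate" match="*[not(*)]" use="concat(generate-id(..), '|', name(), '|', @value)" />
  <xsl:template match="/">
    <!-- Visit every leaf element in document order -->
    <xsl:for-each select="//*[not(*)]">
      <!-- All elements that share this element's parent, name and value -->
      <xsl:variable name="group" select="key('duplicate', concat(generate-id(..), '|', name(), '|', @value))" />
      <xsl:value-of select="concat(name(..), '/', name(), '=', @value, ' group size: ', count($group))" />
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

On the sample input from the question, both <name value="galaxy"/> elements under <Samsung> report a group size of 2, while <name value="galaxys"/> reports 1. Any element whose group is larger than 1 has duplicates, and only the first member of each group should survive.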
Then, to ignore the duplicates, you match any element that is not the first one returned by the key for its 'lookup' value:
<xsl:template match="*[not(*)]
    [generate-id() !=
     generate-id(key('duplicate', concat(generate-id(..), '|', name(), '|', @value))[1])]" />
Here is the full XSLT in this case
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />
  <xsl:key name="duplicate" match="*[not(*)]" use="concat(generate-id(..), '|', name(), '|', @value)" />
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[not(*)][generate-id() != generate-id(key('duplicate', concat(generate-id(..), '|', name(), '|', @value))[1])]" />
</xsl:stylesheet>
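If an XSLT 2.0 processor is available (an assumption, since the stylesheets above target 1.0), xsl:for-each-group can express the same grouping without a key. The following is a minimal sketch rather than a drop-in replacement; it assumes the elements to de-duplicate sit under a parent whose child elements are all leaf elements, as in the sample.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />
  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Parents whose child elements are all leaves, e.g. Samsung and htc -->
  <xsl:template match="*[*][not(*/*)]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Group the children by name and value; only child elements are
           processed, so text and comments between them are not copied -->
      <xsl:for-each-group select="*" group-by="concat(name(), '|', @value)">
        <!-- Keep only the first element of each group -->
        <xsl:apply-templates select="current-group()[1]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Groups are processed in order of first appearance, so the surviving elements keep the relative order they had in the input.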
Upvotes: 2
Reputation: 9627
Nearly the same as the answer from @siva2012, but it also gives the correct result when there is only one name child.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="xml" />
  <xsl:template match="name">
    <xsl:variable name="text" select="text()"/>
    <xsl:if test="not(following-sibling::name[text()= $text])" >
      <xsl:copy>
        <xsl:apply-templates select="node()|@*" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>
</xsl:stylesheet>
Upvotes: 0
Reputation: 459
When this transformation
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Copy a name only when a following sibling name has the same text -->
  <xsl:template match="name">
    <xsl:if test="self::name/text()= following-sibling::name/text()">
      <xsl:copy>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
  <!-- Same rule for name2 -->
  <xsl:template match="name2">
    <xsl:if test="self::name2/text()= following-sibling::name2/text()">
      <xsl:copy>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
is applied to the XML below
<?xml version="1.0"?>
<check>
  <val>
    <sai>
      <name> A</name>
      <name> A</name>
      <name2> B</name2>
      <name2> B</name2>
    </sai>
    <dinesh>
      <name> A</name>
      <name> A</name>
      <name2> B</name2>
      <name2> B</name2>
    </dinesh>
  </val>
</check>
it produces the required output
<?xml version='1.0' ?>
<check>
  <val>
    <sai>
      <name> A</name>
      <name2> B</name2>
    </sai>
    <dinesh>
      <name> A</name>
      <name2> B</name2>
    </dinesh>
  </val>
</check>
Upvotes: 1