user2902426

Reputation: 77

XSLT 1.0 - Removing duplicates based on an attribute value in an XML file

I need some help with removing duplicate entries via XSLT 1.0. I have read the answers I could find on this topic, including the approaches suggested on Stack Overflow (and https://www.jenitennison.com/xslt/grouping/muenchian.xml), but I cannot figure out how to do the transformation in XSLT 1.0.

This is (a part of) the source XML:

<?xml version="1.0" encoding="UTF-8"?>
<A>
  <B>
    <D id="PK-134" name="BBO" XorY="ADV" />
    <D id="PK-46" name="BCAMM" XorY="ADV" />
    <D id="PK-46" name="BCAmm" XorY="ADV" />
    <D id="PK-425" name="Berta" XorY="ADV" />
    <D id="PK-425" name="WWERTA" XorY="ADV" />
    <D id="PK-425" name="Werta (BW)" XorY="ADV" />
    <D id="PK-1392" name="DDex Analyzer" XorY="ADV" />
    <D id="PK-1392" name="Ddex Analyzer" XorY="ADV" />
    <D id="PK-605" name="KL DB" XorY="ADV" />
    <D id="PK-605" name="KL DB (BW)" XorY="ADV" />
    <D id="PK-142" name="CS" XorY="ADV" />
    <D id="PK-142" name="CS (FS)" XorY="ADV" />
    <D id="PK-142" name="CS FS" XorY="ADV" />
    <D id="PK-142" name="CS FS-DE" XorY="ADV" />
  </B>
</A>

The desired output would be:

(Remark: for each id, the first node found in the source XML is the relevant one and should be the one kept in the target XML, e.g. <D id="PK-46" name="BCAMM" XorY="ADV" /> or <D id="PK-142" name="CS" XorY="ADV" />)

<?xml version="1.0" encoding="UTF-8"?>
<A>
  <B>
    <D id="PK-134" name="BBO" XorY="ADV" />
    <D id="PK-46" name="BCAMM" XorY="ADV" />
    <D id="PK-425" name="Berta" XorY="ADV" />
    <D id="PK-1392" name="DDex Analyzer" XorY="ADV" />
    <D id="PK-605" name="KL DB" XorY="ADV" />
    <D id="PK-142" name="CS" XorY="ADV" />
  </B>
</A>

The duplicates should be removed based on the "id" attribute of the D elements.

Here is my approach:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"
              version="1.0"
              encoding="utf-8"
              indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!--
  Identity transform: copy elements and attributes from input file as is
  -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop <D> elements with a preceding <D> sibling that has the same @id attribute value as the current element -->
  <xsl:template match="D[preceding-sibling::D[@id = current()/@id]]"/>

</xsl:stylesheet>

Unfortunately, I am receiving the following error message (Visual Studio 2017):

error: The 'current()' function cannot be used in a pattern.

Thanks a lot in advance & regards!

Upvotes: 1

Views: 77

Answers (1)

Martin Honnen

Reputation: 167716

To combine the identity transformation with a key (Muenchian grouping), you basically need the following:

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:key name="dup" match="D" use="@id"/>
  
  <xsl:template match="D[not(generate-id() = generate-id(key('dup', @id)[1]))]"/>
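For reference, here is a sketch of how the three pieces above might be assembled into a complete stylesheet. The `xsl:output` and `xsl:strip-space` settings are taken from the question; the comments are my reading of what each part does, and I am assuming the sample input shown in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index every D element in the document by its id attribute -->
  <xsl:key name="dup" match="D" use="@id"/>

  <!-- Identity transform: copy elements and attributes as they are -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop any D that is not the first D (in document order)
       returned by the key for its own id -->
  <xsl:template match="D[not(generate-id() = generate-id(key('dup', @id)[1]))]"/>
</xsl:stylesheet>
```

`key('dup', @id)[1]` is the first `D` in document order with the current element's `id`, so comparing `generate-id()` values keeps exactly that element and lets the empty template swallow the rest. Note that the key spans the whole document: if the same `id` may legitimately appear under different `B` parents, it would have to include the parent in its `use` value. For a small input, the key-free pattern `D[@id = preceding-sibling::D/@id]` should also work in XSLT 1.0 without `current()`, since the context node inside the predicate is already the `D` being matched.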

Upvotes: 1
