Mohit

Reputation: 25

Remove duplicate records from an XML file based on a condition using XSLT

If the input XML contains two records with the same Emplid, I want to delete the one whose status is 'withdrawn' and keep only the one whose status is 'active'.

Input xml

<Recordset>
<Record>
    <Emplid>10001</Emplid>
    <name>Bob Dylan</name>
    <country>USA</country>
    <company>Columbia</company>
    <status>active</status>
    <year>1985</year>
</Record>
<Record>
    <Emplid>10002</Emplid>
    <name>Bonnie Tyler</name>
    <country>UK</country>
    <company>CBS Records</company>
    <status>withdrawn</status>
    <year>1988</year>
</Record>
<Record>
    <Emplid>10001</Emplid>
    <name>Bob Dylan</name>
    <country>Uk</country>
    <company>CBS Records</company>
    <status>withdrawn</status>
    <year>1975</year>
</Record>
</Recordset>

expected xml

<Recordset>
<Record>
    <Emplid>10001</Emplid>
    <name>Bob Dylan</name>
    <country>USA</country>
    <company>Columbia</company>
    <status>active</status>
    <year>1985</year>
</Record>
<Record>
    <Emplid>10002</Emplid>
    <name>Bonnie Tyler</name>
    <country>UK</country>
    <company>CBS Records</company>
    <status>withdrawn</status>
    <year>1988</year>
</Record>
</Recordset>

I would appreciate it if someone could help me.

Upvotes: 0

Views: 158

Answers (2)

michael.hor257k

Reputation: 116993

So basically you want to copy all active records, and also any withdrawn records that do not have an active record with the same Emplid?

yes exactly

Well, then why not do exactly that:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="act" match="Record[status='active']" use="Emplid" />

<xsl:template match="/Recordset">
    <xsl:copy>
        <xsl:copy-of select="Record[status='active']"/>
        <xsl:copy-of select="Record[status='withdrawn'][not(key('act', Emplid))]"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

If you want to keep the original order, you can combine the two xsl:copy-of instructions into one:

        <xsl:copy-of select="Record[status='active'] | Record[status='withdrawn'][not(key('act', Emplid))]"/>

Note the use of a key to resolve cross-references.
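
The same key can also drive an identity-transform variant, shown here only as a sketch under a few assumptions: the status values are spelled exactly `active` and `withdrawn`, and records with any other status should pass through unchanged (the stylesheet above would drop them). Everything is copied in document order, and one empty template suppresses each withdrawn record that has an active record with the same Emplid.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- index the active records by their Emplid -->
<xsl:key name="act" match="Record[status='active']" use="Emplid" />

<!-- identity template: copy every node and attribute as-is -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- output nothing for a withdrawn record that has an active counterpart -->
<xsl:template match="Record[status='withdrawn'][key('act', Emplid)]"/>

</xsl:stylesheet>
```

The empty template wins over the identity template because a pattern with predicates has a higher default priority than `node()`.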

Upvotes: 1

Yitzhak Khabinsky

Reputation: 22187

You can try the following XSLT.

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Record[status='withdrawn']">
        <xsl:variable name="Emplid" select="./Emplid"/>
        <xsl:choose>
            <xsl:when test="count(/Recordset/Record[Emplid=$Emplid]) &gt; 1">
                <!-- another record shares this Emplid: output nothing for the withdrawn one -->
            </xsl:when>
            <xsl:when test="count(/Recordset/Record[Emplid=$Emplid]) = 1">
                <xsl:copy-of select="."/>
            </xsl:when>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 0
