Ross
Ross

Reputation: 13

SQL Remove duplicate parent child entries

I have tried looking around for this issue and couldn't find much. I was wondering the best method of removing duplicate parent/child rows in an SQL table.

For example I have a table like so:

Id | CompanyId | DuplicateId
1  |     1     |     2
2  |     2     |     1    <------ CompanyId 2 is already a duplicate of 1
3  |     2     |     3

I wish to select all non duplicates:

Id | CompanyId | DuplicateId
1  |     1     |     2
3  |     2     |     3

Any help or pointing in the right direction would be great. Thanks!

Edit: I'm using Microsoft SQL Server

Upvotes: 1

Views: 831

Answers (3)

Florin
Florin

Reputation: 555

Another way in Sql Server to remove dups parent child could be this:

  delete from your_table where concat(companyid,duplicateid)in
(
SELECT concat(table1.companyid , table1.duplicateid )
FROM your_table table1
    inner join your_table table2
    ON table1.companyid = table2.duplicateid
    AND table1.duplicateid = table2.companyid and  table1.companyid>table1.duplicateid
)

Hope it helps.

Upvotes: 0

Raj Kamuni
Raj Kamuni

Reputation: 388

Try this

SELECT * FROM tablename t1 where CompanyId not IN  (SELECT Duplicateid from  
tablename WHERE CompnayId<>t1.CompanyId and duplicateid > companyid)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270391

Here is one method, that works in most databases:

delete from sqltable
     where duplicateid > companyid and
           exists (select 1
                   from sqltable st2
                   where st2.duplicateid = sqltable.companyid and
                         st2.companyid = sqltable.duplicateid
                  );

Upvotes: 1

Related Questions