Reputation: 145

Query for finding duplicates based on two columns

I have a scenario where we need to identify duplicate records based on two different columns.For ex. if there are duplicate records present in COLA those records should not be considered and same case with COLB. But while taking duplicates we should not consider COLA and COLB together. Duplicate should be considered separately. My input table is given below.

Input table

Expected Output.

I was thinking of writing a group by query using two separate queries and then do a UNION.But just want to check any other easy way to find it.

Upvotes: 0

Answers (3)

Wasim Ansari

Reputation: 290

This would work for your scenario.

select COLA,COLB,COLD  from test 
where COLB not in (select COLB  from test group by COLB having (count(COLB) >1))
group by COLA having (count(COLA) =1);

Upvotes: 0

Tim Biegeleisen

Reputation: 521103

We could try using exists logic here:

SELECT COLA, COLB, COLD
FROM yourTable t1
WHERE
    NOT EXISTS (SELECT 1 FROM yourTable t2
                WHERE (t2.COLA = t1.COLA AND t2.PK <> t1.PK) OR
                      (t2.COLB = t1.COLB AND t2.PK <> t1.PK));

I am assuming here that your table has some primary key column PK. If it doesn't have one, add it. You might be able to use COLD in place of PK, assuming that COLD values would always be unique.

Upvotes: 0

Yogesh Sharma

Reputation: 50163

You can use window function :

select t.*
from (select t.*, 
             count(*) over (partition by cola) as cola_cnt,
             count(*) over (partition by colb) as colb_cnt
      from table t
     ) t
where cola_cnt = 1 and colb_cnt = 1;

Upvotes: 1

Query for finding duplicates based on two columns

Answers (3)

Related Questions