Reputation: 145
I have a scenario where we need to identify duplicate records based on two different columns.For ex. if there are duplicate records present in COLA those records should not be considered and same case with COLB. But while taking duplicates we should not consider COLA and COLB together. Duplicate should be considered separately. My input table is given below.
Input table
Expected Output.
I was thinking of writing a group by query using two separate queries and then do a UNION.But just want to check any other easy way to find it.
Upvotes: 0
Views: 139
Reputation: 290
This would work for your scenario.
select COLA,COLB,COLD from test
where COLB not in (select COLB from test group by COLB having (count(COLB) >1))
group by COLA having (count(COLA) =1);
Upvotes: 0
Reputation: 521103
We could try using exists logic here:
SELECT COLA, COLB, COLD
FROM yourTable t1
WHERE
NOT EXISTS (SELECT 1 FROM yourTable t2
WHERE (t2.COLA = t1.COLA AND t2.PK <> t1.PK) OR
(t2.COLB = t1.COLB AND t2.PK <> t1.PK));
I am assuming here that your table has some primary key column PK
. If it doesn't have one, add it. You might be able to use COLD
in place of PK
, assuming that COLD
values would always be unique.
Upvotes: 0
Reputation: 50163
You can use window function :
select t.*
from (select t.*,
count(*) over (partition by cola) as cola_cnt,
count(*) over (partition by colb) as colb_cnt
from table t
) t
where cola_cnt = 1 and colb_cnt = 1;
Upvotes: 1