Reputation: 51
I have a Customer table with ID and Email fields. I've written the following query to return all customers whose Email is shared with another customer:
SELECT a.ID, a.Email
FROM Customer a
WHERE EXISTS (SELECT 1
              FROM Customer b
              WHERE a.Email = b.Email
              GROUP BY b.Email
              HAVING COUNT(b.Email) = 2)
ORDER BY a.Email
This is returning records that look like the following:
ID Email
1 [email protected]
2 [email protected]
3 [email protected]
4 [email protected]
While this works, I actually need the data in the following format:
ID1 Email1 ID2 Email2
1 [email protected] 2 [email protected]
3 [email protected] 4 [email protected]
What is the best way to achieve this?
Upvotes: 0
Views: 70
Reputation: 2976
Try:
SELECT MIN(ID) ID, Email, MAX(ID) ID2, Email AS EMAIL2
FROM Customer GROUP BY Email
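The query above returns one row per email, including emails that are not duplicated. If you only want emails that appear exactly twice, add HAVING COUNT(Email) = 2: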
SELECT MIN(ID) ID, Email, MAX(ID) ID2, Email AS EMAIL2
FROM Customer GROUP BY Email
HAVING COUNT(Email) = 2
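To make the result shape concrete, here is a minimal sketch that runs the same MIN/MAX query against an in-memory SQLite database from Python. SQLite is only a stand-in for whatever engine you are using, and the example.com addresses are made-up sample data, not the asker's rows.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Customer (ID, Email) VALUES (?, ?)",
    [
        (1, "a@example.com"), (2, "a@example.com"),
        (3, "b@example.com"), (4, "b@example.com"),
        (5, "c@example.com"),  # no duplicate, filtered out by HAVING
    ],
)

# One row per email that occurs exactly twice: lowest ID left, highest ID right.
pairs = conn.execute(
    """
    SELECT MIN(ID) AS ID1, Email AS Email1, MAX(ID) AS ID2, Email AS Email2
    FROM Customer
    GROUP BY Email
    HAVING COUNT(Email) = 2
    ORDER BY Email
    """
).fetchall()

for row in pairs:
    print(row)
```

Because the grouping is on Email, Email1 and Email2 are always identical; the second column is there only to match the requested layout.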
Upvotes: 0
Reputation: 264
Your layout assumes that at most two customers ever share the same email.
Maybe list the IDs for each duplicated email instead, like below?
-- Collect each duplicated email with an empty placeholder for its ID list.
DECLARE @Duplicates TABLE (Email varchar(50), Customers varchar(100))
INSERT @Duplicates
SELECT Email, '' FROM Customer GROUP BY Email HAVING COUNT(*) > 1

-- Fill in a comma-separated list of the customer IDs for each email.
UPDATE d
SET Customers = STUFF((SELECT ',' + CAST(ID AS varchar(10))
                       FROM Customer c
                       WHERE c.Email = d.Email
                       FOR XML PATH(''), TYPE).value('.', 'VARCHAR(max)'), 1, 1, '')
FROM @Duplicates AS d

SELECT * FROM @Duplicates
ORDER BY Email
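If you want to try the "list the IDs" idea without SQL Server, here is a rough sketch using SQLite from Python. Note that it swaps the STUFF / FOR XML PATH trick for SQLite's GROUP_CONCAT aggregate, so it illustrates the same output rather than the same syntax. The sample rows are invented.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Customer (ID, Email) VALUES (?, ?)",
    [
        (1, "a@example.com"), (2, "a@example.com"), (3, "a@example.com"),
        (4, "b@example.com"), (5, "b@example.com"),
        (6, "c@example.com"),  # unique, excluded by HAVING
    ],
)

# One row per duplicated email with its IDs joined by commas.
rows = conn.execute(
    """
    SELECT Email, GROUP_CONCAT(ID, ',') AS Customers
    FROM Customer
    GROUP BY Email
    HAVING COUNT(*) > 1
    ORDER BY Email
    """
).fetchall()

# GROUP_CONCAT does not promise an element order, so sort the IDs in Python.
duplicates = {
    email: sorted(int(i) for i in ids.split(","))
    for email, ids in rows
}
print(duplicates)
```

This handles any number of duplicates per email, which the fixed two-column layout cannot.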
Upvotes: 0
Reputation: 1269563
One method is conditional aggregation, assuming each email appears at most twice:
select max(case when seqnum = 1 then id end) as id_1,
       email as email_1,
       max(case when seqnum = 2 then id end) as id_2,
       email as email_2
from (select t.*,
             row_number() over (partition by email order by id) as seqnum
      from t
     ) t
group by email;
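Here is a small sketch of the conditional aggregation query run through SQLite from Python, mainly to show what the "at most two" assumption means in practice. It assumes the table is the asker's Customer (in place of t), needs SQLite 3.25 or later for window functions, and uses invented sample rows.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Customer (ID, Email) VALUES (?, ?)",
    [
        (1, "a@example.com"), (2, "a@example.com"),
        (3, "b@example.com"), (4, "b@example.com"),
        (5, "b@example.com"),  # third row for the same email
    ],
)

# Number the rows within each email, then pivot rows 1 and 2 into columns.
rows = conn.execute(
    """
    SELECT MAX(CASE WHEN seqnum = 1 THEN ID END) AS id_1,
           Email AS email_1,
           MAX(CASE WHEN seqnum = 2 THEN ID END) AS id_2,
           Email AS email_2
    FROM (SELECT c.*,
                 ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ID) AS seqnum
          FROM Customer c) AS numbered
    GROUP BY Email
    ORDER BY Email
    """
).fetchall()

for row in rows:
    print(row)
```

ID 5 never shows up in the output because only seqnum 1 and 2 are pivoted. Covering a third duplicate would take another CASE column for seqnum = 3.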
Actually, why not just do:
select email, count(*) as num_dups, min(id) as id_1,
(case when count(*) > 1 then max(id) end) as id_2
from t
group by email;
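For comparison, a quick sketch of this simpler version, again via SQLite from Python with the asker's Customer table assumed in place of t and made-up rows. The num_dups column is what tells you whether a row is a real pair, a singleton, or a group larger than two.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Customer (ID, Email) VALUES (?, ?)",
    [
        (1, "a@example.com"), (2, "a@example.com"),
        (3, "b@example.com"), (4, "b@example.com"), (5, "b@example.com"),
        (6, "c@example.com"),
    ],
)

# id_2 is NULL (None in Python) when the email occurs only once.
summary = conn.execute(
    """
    SELECT Email, COUNT(*) AS num_dups, MIN(ID) AS id_1,
           CASE WHEN COUNT(*) > 1 THEN MAX(ID) END AS id_2
    FROM Customer
    GROUP BY Email
    ORDER BY Email
    """
).fetchall()

for row in summary:
    print(row)
```

For the email with three rows, id_1 and id_2 are the lowest and highest IDs, and num_dups = 3 signals that one ID in between is not shown.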
Upvotes: 1