Reputation: 1007
I have a table T1 with columns id, C1, C2 and C3. I am using the following query to find duplicate records
Select group_concat(id) from T1 group by C2 having count(id) >1;
Now I want to group all the duplicate records by column C3. How do I do it? Note : I am not expecting
Select group_concat(id) from T1 group by C2,C3 having count(id) >1;
I want to get all the records having duplicate values on C2 and group them only based on C3 irrespective of their C2 value
id C1 C2 C3
1 a 3 A
2 b 2 A
3 c 2 A
4 d 2 B
5 e 3 C
In the above data 1,5 are duplicate record with C2 value 3 and 2,3,4 is duplciate records with C2 value 2. I want an output
A - has 2 duplicates (with C2 values 2 and 3 )
B - has 1 duplicate (with C2 value 2)
C - has 1 duplicate (with C2 value 3)
Upvotes: 1
Views: 1034
Reputation: 28834
GROUP BY
on C2
and identify their count(s). C2
value having count more than 1 is basically a duplicate (occurring in more than one row).C2
. This will help us in getting an additional column showing C2
count against every row.C3
using COUNT(DISTINCT ...)
, considering those cases where count is more than 1.Try:
SELECT
t.C3,
COUNT(DISTINCT IF(dt.count_C2 > 1, t.C2, NULL)) AS duplicates
FROM
your_table AS t
JOIN
(
SELECT
C2,
COUNT(id) AS count_C2
FROM your_table
GROUP BY C2
) AS dt
ON dt.C2 = t.C2
GROUP BY t.C3
Result
| C3 | duplicates |
| --- | ---------- |
| A | 2 |
| B | 1 |
| C | 1 |
Upvotes: 1
Reputation: 6441
SELECT GROUP_CONCAT(id)
FROM T1
WHERE C2 IN
(
SELECT C2
FROM T1
GROUP BY C2
HAVING COUNT(id)>1
)
GROUP BY C3
Upvotes: 1