Reputation: 379
I have a database of names, some of the names haven't been insert in the correct fashion. SecondName has sometimes been entered as FirstName.
+-----------------+--------------+
| FirstName | SecondName |
+-----------------+--------------+
| Bob | Smith |
| Gary | Rose |
| John | Jones |
| Smith | Bob |
| Gary | Oberstein |
| Adam | Sorbet |
| Jones | John |
+-----------------+--------------+
I've tried different grouping queries
select `FirstName`
, `SecondName`
from `names`
where ( `FirstName`
, `SecondName` )
in ( select `FirstName`
, `SecondName`
from `names`
group
by `FirstName`
, `SecondName`
having count(*) > 1
)
But I can't get anything to produce
+-----------------+--------------+---------+
| FirstName | SecondName | Count |
+-----------------+--------------+---------+
| Bob | Smith | 2 |
| John | Jones | 2 |
+-----------------+--------------+---------+
Upvotes: 1
Views: 47
Reputation: 10277
If the COUNT()
itself isn't important, you can easily separate duplicates with an INNER JOIN
SELECT n.FirstName, n.SecondName, n2.FirstName, n2.SecondName
FROM Names n
INNER JOIN Names n2 on n.FirstName = n2.SecondName and n.SecondName = n2.FirstName
Upvotes: 2
Reputation: 70523
There is a trick to do this, you need to normalize your names, a quick way to do this is if you alphabetize first name and last name then group on the result.
SELECT name_normalized, count(*) as c
FROM (
SELECT CASE WHEN FIRSTNAME < LASTNAME THEN FIRSTNAME||LASTNAME
ELSE LASTNAME|| FIRSTNAME END as name_normalized
FROM names
) X
GROUP BY name_normalized
Notes:
Upvotes: 4