Reputation: 16242
I want to find all users whose name appears at least twice in my User table. 'email' is a unique field, but the combination of 'firstName' and 'lastName' is not necessarily unique.
So far I have come up with the following query, which is very slow, and I am not even sure it is correct. Please let me know a better way to rewrite this.
SELECT CONCAT(u2.firstName, u2.lastName) AS fullName
FROM cpnc_User u2
WHERE CONCAT(u2.firstName, u2.lastName) IN (
SELECT CONCAT(u2.firstName, u2.lastName) AS fullNm
FROM cpnc_User u1
GROUP BY fullNm
HAVING COUNT(*) > 1
)
Also, note that the above returns the list of names that appear at least twice (I think so, anyway), but what I really want is the complete list of all user 'id' fields for these names. So each name, since it appears at least twice, will be associated with at least two primary key 'id' fields.
Thanks for any help! Jonah
Upvotes: 4
Views: 3318
Reputation: 115660
SELECT u.id
, CONCAT(u.firstName, ' ', u.lastName) AS fullname
FROM cpnc_User u
JOIN
( SELECT min(id) AS minid
, firstName
, lastName
FROM cpnc_User
GROUP BY firstName, lastName
HAVING COUNT(*) > 1
) AS grp
ON u.firstName = grp.firstName
AND u.lastName = grp.lastName
ORDER BY grp.minid
, u.id
The ORDER BY grp.minid
ensures that users with same first and last name stay grouped together in the output.
Upvotes: 2
Reputation: 107826
SELECT u.*
FROM cpnc_User u JOIN
(
SELECT firstName, lastName
FROM cpnc_User
GROUP BY firstName, lastName
HAVING COUNT(*) > 1
) X on X.firstName = u.firstName AND x.lastName = u.lastName
ORDER BY u.firstName, u.lastName
There is no need to make up a concatenated field, just use the 2 fields separately
Upvotes: 7
Reputation: 3722
To experiment I created a simple table with two columns a user id, and a name. I inserted a bunch of records, including some duplicates. Then ran this query:
SELECT
count(id) AS count,
group_concat(id) as IDs
FROM
test
GROUP BY
`name`
ORDER BY
count DESC
It should give you results like this:
+-------+----------+
| count | IDs |
+-------+----------+
| 4 | 7,15,4,1 |
| 2 | 2,8 |
| 2 | 6,13 |
| 2 | 14,9 |
| 1 | 11 |
| 1 | 10 |
| 1 | 3 |
| 1 | 5 |
| 1 | 17 |
| 1 | 12 |
| 1 | 16 |
+-------+----------+
You'll need to filter out the later results using something else.
Upvotes: 2
Reputation: 60413
SELECT u.id, u.firstName, u.lastName
FROM cpnc_User u, (
SELECT uc.firstName, uc.lastName
FROM cpnc_User uc
GROUP BY uc.firstName, uc.lastName
HAVING count(*) > 1
) u2
WHERE (
u.firstName = u2.firstName
AND u.lastName = u2.lastName
)
Upvotes: 3
Reputation: 18012
OK, you are doing a concatenation, then doing a compare on this, which essentially means that the DB is going to have to do something to every single row of the database.
How about a slightly different approach, you are holding surname and first name separately. So first select all those instances where surname appears > 1 time in your database. Now this has cut your population down dramatically.
Now you can do a compare on the first name to find out where the matches are.
Upvotes: 1