Reputation: 1229
Using PostgreSQL, I'm trying to find a way to select every row that duplicate values for a certain column.
For example, my table would look like this:
id | username | email
1 | abc | [email protected]
2 | abc1 | [email protected]
3 | def | [email protected]
4 | ghi | [email protected]
5 | ghi1 | [email protected]
And my desired output would select the username and the email, where email count > 2:
abc | [email protected]
abc1 | [email protected]
ghi | [email protected]
ghi1 | [email protected]
I have tried group by having
, and that gets me close to what I want, but I don't think I want to use group by
because that will actually combine the rows with duplicate values, I still want to show the separate rows that contain duplicate values.
SELECT email FROM auth_user
GROUP BY email HAVING count(*) > 1;
That only shows me the emails that have duplicate values:
[email protected]
[email protected]
I can include the count in there with SELECT email, count(*) FROM ...
but that's not what I want either.
I'm thinking I want something like where count(email) > 1
but that gives me an error saying ERROR: aggregate functions are not allowed in WHERE
How can I select duplicate values without grouping them?
Update with Solution:
@GordonLinoff posted the correct answer. But to match my exact needs of only obtaining username and email fields, I have modified his a little bit (which should be self explanatory, but posting in case anyone else needs the exact query)
select username, email
from (select username, email, count(*)
over (partition by email) as cnt
from auth_user au
) au
where cnt > 1;
Upvotes: 0
Views: 80
Reputation: 343
You might find this helpful too:
select t1.*, t2.*
from auth_user t1, auth_user t2
where t1.id != t2.id
and t1.email = t2.email
Upvotes: 0
Reputation: 1269443
If you want all the original rows, then I would suggest using count(*)
as a window function:
select au.*
from (select au.*, count(*) over (partition by email) as cnt
from auth_user au
) au
where cnt > 1;
Upvotes: 3