amallard
amallard

Reputation: 1229

How to SELECT every row that has a duplicate value of a field

Using PostgreSQL, I'm trying to find a way to select every row that duplicate values for a certain column.

For example, my table would look like this:

id   | username | email
1    | abc      | [email protected]
2    | abc1     | [email protected]
3    | def      | [email protected]
4    | ghi      | [email protected]
5    | ghi1     | [email protected]

And my desired output would select the username and the email, where email count > 2:

abc   | [email protected]
abc1  | [email protected]
ghi   | [email protected]
ghi1  | [email protected]

I have tried group by having, and that gets me close to what I want, but I don't think I want to use group by because that will actually combine the rows with duplicate values, I still want to show the separate rows that contain duplicate values.

SELECT email FROM auth_user
GROUP BY email HAVING count(*) > 1;

That only shows me the emails that have duplicate values:

[email protected]
[email protected]

I can include the count in there with SELECT email, count(*) FROM ... but that's not what I want either.

I'm thinking I want something like where count(email) > 1 but that gives me an error saying ERROR: aggregate functions are not allowed in WHERE

How can I select duplicate values without grouping them?

Update with Solution:

@GordonLinoff posted the correct answer. But to match my exact needs of only obtaining username and email fields, I have modified his a little bit (which should be self explanatory, but posting in case anyone else needs the exact query)

select username, email
from (select username, email, count(*) 
      over (partition by email) as cnt
      from auth_user au
) au
where cnt > 1;

Upvotes: 0

Views: 80

Answers (2)

Cyril Damm
Cyril Damm

Reputation: 343

You might find this helpful too:

select t1.*, t2.*
from auth_user t1, auth_user t2
where t1.id != t2.id
and t1.email = t2.email

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269443

If you want all the original rows, then I would suggest using count(*) as a window function:

select au.*
from (select au.*, count(*) over (partition by email) as cnt
      from auth_user au
     ) au
where cnt > 1;

Upvotes: 3

Related Questions