Finding statistics about duplicate emails in MySQL

Question

I need to query statistics about duplicate values in my database table. For example, say that I have an e-mail field and multiple rows can have the same e-mail. I know want to know is how many addresses is repeated how many times. In other words: "908 emails are repeated 10 times, 1783 emails are repeated 9 times" and so on.

Repeated       # of Emails
10             908
9              1783

I don't need to see the actual e-mail addresses, just these statistics.

Right know I have this query which also retrieves the e-mail address:

select email_address,
count(email_address) as NumberOccurrences
from table_user_info
group by email_address
having ( count(email_address) > 1 )

How do I group these results?

Michael Berkowski · Accepted Answer

An aggregate COUNT() with a subquery also returning an aggregate COUNT() will provide this. The subquery groups and counts per email address, as in abc@example.com - 10, and the outer query then counts and groups by the number of repeats returned by the subquery, discarding the actual email addresses.

SELECT
  repeated,
  COUNT(*) AS numemails
FROM (
  SELECT 
     email,
     COUNT(*)
  FROM emails
  GROUP BY email
) emailcounts

Finding statistics about duplicate emails in MySQL

Answers (2)

Related Questions