ejunker
ejunker

Reputation: 11011

MySQL COUNT(DISTINCT()) unexpected results

I'm using MySQL 5.0.45 on CentOS 5.1.

SELECT DISTINCT(email) FROM newsletter

Returns 217259 rows

SELECT COUNT(DISTINCT(email)) FROM newsletter

Returns 180698 for the count.

SELECT COUNT(*) FROM (SELECT DISTINCT(email) FROM newsletter) AS foo

Returns 180698 for the count.

Shouldn't all 3 queries return the same value?

Here is the schema of the newsletter table

CREATE TABLE `newsletter` (
  `newsID` int(11) NOT NULL auto_increment,
  `email` varchar(128) NOT NULL default '',
  `newsletter` varchar(8) NOT NULL default '',
  PRIMARY KEY  (`newsID`)
) ENGINE=MyISAM;

Update: I've found that if I add a WHERE clause to the first query then I get the correct results. The WHERE clause is such that it will not effect the results.

SELECT DISTINCT(email) FROM newsletter WHERE newsID > 0

Upvotes: 3

Views: 3657

Answers (5)

Dave Costa
Dave Costa

Reputation: 48111

One possible explanation is that rows were deleted from the table between your two queries. But I assume you've run these multiple times in various orders so I doubt that's it.

What rowcount do you get from this query:

SELECT email FROM newsletter GROUP BY email;

Upvotes: 0

Leandro López
Leandro López

Reputation: 2185

Yes, as Maglob said you may be getting some NULLs in your email column. Try with something like:

SELECT COUNT(COALESCE(email, 0)) FROM newsletter

That would give you the results you expect. Oh, wait. The definition says that email does not accept NULL values. If that is truly the case, then it is odd.

Upvotes: 0

Maglob
Maglob

Reputation: 1583

My guess is that there are some null values in email column. Try

select count(*) from newsletter where email is null;

Upvotes: 2

Paul Tomblin
Paul Tomblin

Reputation: 182792

In the first case, what happens if you change it to "select distinct(email) from newsletter order by email;" I don't think it should make a difference, but it might.

Upvotes: 0

John Boker
John Boker

Reputation: 83709

could you possibly intersect the differing results to see what the extra rows are?

Upvotes: 2

Related Questions