Reputation: 46591
When I run the following queries with SQL_CALC_FOUND_ROWS
or specify all the colors RED,BLUE,YELLOW
in the sub-query using an IN
clause, I get a record count of 198
. But when I do each color individually, I get a COUNT
that adds up to 248 (72 + 59 + 118)
. Where are the extra 50 rows coming from? I would have thought that if I did them individually and then added them up, I would get 198
, but I am getting 248
. Am I viewing how the IN clause works incorrectly? I would prefer to use COUNT
instead of SQL_CALC_FOUND_ROWS
because I have found SQL_CALC_FOUND_ROWS
to be much slower (+3 seconds)
Here are few stats:
1. I am running MySQL Version: 5.5.2
2. The tables engine is INNODB.
3. All the `CHECK TABLE .... EXTENDED` return OK.
-- Returns 198
SELECT SQL_CALC_FOUND_ROWS DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW','RED','BLUE');
SELECT FOUND_ROWS();
-- Returns 198
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW','RED','BLUE')) all;
-- Returns 72
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('RED')) red;
-- Returns 59
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('BLUE')) blue;
-- Returns 118
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW')) yellow;
Upvotes: 0
Views: 54
Reputation: 26343
Something like this is happening:
Persons
table
ID Name
-- ------------
1 John Doe
2 Xaisoft
Colors
table
PersonID ColorName
-------- ---------
1 Red
1 Yellow
2 Yellow
Now query Persons.*
for John Doe
while joining to Colors
:
SELECT p.*
FROM Persons p
INNER JOIN Colors C ON c.PersonID = p.ID
WHERE p.Name = 'John Doe';
Your result will be one p.*
row for each color under John Doe:
ID Name
-- ------------
1 John Doe
1 John Doe
If you apply a COUNT DISTINCT
against this you'll end up with 1
.
If you count Red
and Yellow
separately, you'll get one for each query. Add them up and you'll get 2
.
Look at it this way: if you throw out the COUNT
and just SELECT DISTINCT Persons.*
you get these results:
Red or Yellow for John Doe:
SELECT DISTINCT p.*
FROM Persons p
INNER JOIN Colors C ON c.PersonID = p.ID
WHERE p.Name = 'John Doe'
AND c.ColorName IN ('RED', 'YELLOW');
ID Name
-- ----------
1 John Doe
Red only for John Doe:
SELECT DISTINCT p.*
FROM Persons p
INNER JOIN Colors C ON c.PersonID = p.ID
WHERE p.Name = 'John Doe'
AND c.ColorName IN ('RED');
ID Name
-- ----------
1 John Doe
Yellow only for John Doe:
SELECT DISTINCT p.*
FROM Persons p
INNER JOIN Colors C ON c.PersonID = p.ID
WHERE p.Name = 'John Doe'
AND c.ColorName IN ('YELLOW');
ID Name
-- ----------
1 John Doe
Upvotes: 1