Reputation: 573
in my query http://www.sqlfiddle.com/#!9/16b9c2/3/0
SELECT f.name, COUNT(s.letters), AVG(t.numbers)
FROM first f
INNER JOIN second s
ON s.first_id = f.id AND s.letters IN ('aa', 'bb', 'cc', 'dd')
LEFT JOIN third t
ON t.first_id = f.id
GROUP BY f.id
before joining third table the count was 4 - 1 but after joining third table it became 16 - 1 which is false because it must be 4 - 1
Upvotes: 1
Views: 254
Reputation: 1270081
Your average may also be subtly off -- or at least require additional unnecessarily calculation.
The reason is that the number of rows is being multiplied along two dimensions. The more accurate solution is to aggregate before doing the join
:
SELECT f.name, s.num_letters, AVG(t.numbers)
FROM first f INNER JOIN
(SELECT s.first_id, COUNT(*) as num_letters
FROM second s
WHERE s.letters IN ('aa', 'bb', 'cc', 'dd')
GROUP BY s.first_id
) s
ON s.first_id = f.id LEFT JOIN
third t
ON t.first_id = f.id
GROUP BY f.id, s.num_letters;
For such queries it might be best to move all the aggregation into subqueries:
SELECT f.name, s.num_letters, t.avg_numbers
FROM first f INNER JOIN
(SELECT s.first_id, COUNT(*) as num_letters
FROM second s
WHERE s.letters IN ('aa', 'bb', 'cc', 'dd')
GROUP BY s.first_id
) s
ON s.first_id = f.id LEFT JOIN
(SELECT t.first_id, AVG(t.numbers) as avg_numbers
FROM third t
GROUP BY t.first_id
) t
ON t.first_id = f.id;
Upvotes: 2
Reputation: 1645
A bit ugly but gets the answer: http://www.sqlfiddle.com/#!9/16b9c2/26
SELECT s2.name, s2.letters, AVG(t.numbers)
FROM (
select f.id, f.name as name, COUNT(s.letters) as letters
from first f
JOIN second s
ON s.first_id = f.id AND s.letters IN ('aa', 'bb', 'cc', 'dd')
GROUP BY f.id) s2
JOIN third t
ON t.first_id = s2.id
group by s2.name;
Whether it's a left join or join or inner join in your case makes no difference since every value is joinable. The "fanout" occurs because the numbers and letters are effectively cross joined since there are no join conditions on them.
The solution suggested does two group by's which gets rid of the crossing by aggregating at every level.
Upvotes: 1
Reputation: 425083
Each row returned is counted, so multiple successful joins from f
to s
will then be joined again for each successful jointo t
.
To fix, use count(distinct ...)
:
SELECT f.name, COUNT(distinct s.letters), AVG(t.numbers)
...
Upvotes: 0