Reputation: 48500
I have a query that looks like the following,
SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login
FROM stats a, users b
WHERE a.user_id = b.user_id
AND a.active = 1
GROUP BY
a.user_id
HAVING
a.day = days_since_login
The problem I have is that if the stats
table has more than one record for the user, zero rows return. If there is exactly one row, it'll return it.
I think my use of HAVING
is incorrect here. The query planner seems to only filter based on the result set from the WHERE clause. Ideally, I'd have this illegal query:
SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login
FROM stats a, users b
WHERE a.user_id = b.user_id
AND a.active = 1
AND a.day = days_since_login
GROUP BY
a.user_id
But that's not possible.
Upvotes: 0
Views: 39
Reputation: 782775
The problem is that when you use GROUP BY
, any of the columns outside the grouped column will be fetched from an unpredictable row in the group. It won't search the group for rows that match the HAVING
clause, because that filtering is done after grouping.
You need to join with a subquery that performs the grouping.
SELECT a.day, a.user_id, b.days_since_login
FROM stats AS a
JOIN (SELECT user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login
FROM users
GROUP BY user_id) AS b
ON a.user_id = b.user_id AND a.day = b.days_since_login
WHERE a.active = 1
Upvotes: 1