randombits
randombits

Reputation: 48500

Properly using the HAVING clause in query

I have a query that looks like the following,

SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login 
  FROM stats a, users b
  WHERE a.user_id = b.user_id
  AND a.active = 1
 GROUP BY 
  a.user_id
 HAVING 
  a.day = days_since_login 

The problem I have is that if the stats table has more than one record for the user, zero rows return. If there is exactly one row, it'll return it.

I think my use of HAVING is incorrect here. The query planner seems to only filter based on the result set from the WHERE clause. Ideally, I'd have this illegal query:

SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login 
  FROM stats a, users b
  WHERE a.user_id = b.user_id
  AND a.active = 1
  AND a.day = days_since_login
 GROUP BY 
  a.user_id

But that's not possible.

Upvotes: 0

Views: 39

Answers (1)

Barmar
Barmar

Reputation: 782775

The problem is that when you use GROUP BY, any of the columns outside the grouped column will be fetched from an unpredictable row in the group. It won't search the group for rows that match the HAVING clause, because that filtering is done after grouping.

You need to join with a subquery that performs the grouping.

SELECT a.day, a.user_id, b.days_since_login
FROM stats AS a
JOIN (SELECT user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login 
      FROM users 
      GROUP BY user_id) AS b
ON a.user_id = b.user_id AND a.day = b.days_since_login
WHERE a.active = 1

Upvotes: 1

Related Questions