Does using HAVING in SQL here compute an aggregate function a second time?

Question

I saw this query as an answer to another question on this site:

SELECT MAX(date), thread_id
FROM yourTable
GROUP BY thread_id 
HAVING MAX(date) < 1555

With this database sample:

+-----+---------+-------------+
|  id |   date  |  thread_id  |
+-----+---------+-------------+
|  1  |   1111  |      4      |
|  2  |   1333  |      4      |
|  3  |   1444  |      5      |
|  4  |   1666  |      5      |
+-----+---------+-------------+

Am I correct in assuming MAX(date) is computed twice here?

If so, that would presumably make the query less efficient. Can the query be refactored so that MAX(date) is computed only once?

Tim Biegeleisen · Accepted Answer

A peek at the execution plan will answer your question. During the GROUP BY aggregation step, MySQL computes the maximum date for each thread_id. The HAVING filter then runs on the already-aggregated rows, so the maximum is available to it without another pass over the data. I would therefore expect MAX(date) to be computed only once.
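
One way to take that peek, as a rough sketch: prefix the query with EXPLAIN. The table name yourTable is a placeholder, and the FORMAT=TREE variant assumes a MySQL 8.0 release recent enough to support it (8.0.16 or later, as far as I know); plain EXPLAIN also works on older versions.

```sql
-- Plain EXPLAIN: shows how MySQL plans to read and group the rows.
EXPLAIN
SELECT MAX(date), thread_id
FROM yourTable
GROUP BY thread_id
HAVING MAX(date) < 1555;

-- Assumption: MySQL 8.0.16+. The tree format lists the plan as nested steps,
-- which makes it easier to see where aggregation and filtering happen.
EXPLAIN FORMAT=TREE
SELECT MAX(date), thread_id
FROM yourTable
GROUP BY thread_id
HAVING MAX(date) < 1555;
```

In the tree form I would expect the HAVING condition to show up as a filter step sitting on top of a single aggregation step, rather than as a second aggregation. The exact output depends on your version and indexes.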

Note that MySQL permits select-list aliases in the HAVING clause (an extension to standard SQL), so you could have written your query as:

SELECT thread_id, MAX(date) AS max_date
FROM yourTable
GROUP BY thread_id 
HAVING max_date < 1555;
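
If you want the single computation to be explicit in the SQL itself, or need a form that also runs on databases that reject aliases in HAVING, a derived table is one option. This is a sketch rather than a recommendation: yourTable and the alias t are placeholders, and I would not expect it to be any faster than the HAVING version.

```sql
-- The inner query aggregates once per thread_id;
-- the outer query filters on the already-computed column.
SELECT thread_id, max_date
FROM (
    SELECT thread_id, MAX(date) AS max_date
    FROM yourTable
    GROUP BY thread_id
) AS t
WHERE max_date < 1555;
```

With the sample data from the question, either form should return a single row: thread_id 4 with a max date of 1333. Thread 5 is filtered out because its maximum, 1666, is not below 1555.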
