Reputation: 736
I saw this query as an answer to another question on this site:
SELECT MAX(date), thread_id
FROM table
GROUP BY thread_id
HAVING MAX(date) < 1555
With this database sample:
+-----------------------------+
| id | date | thread_id |
+-----+---------+-------------+
| 1 | 1111 | 4 |
| 2 | 1333 | 4 |
| 3 | 1444 | 5 |
| 4 | 1666 | 5 |
+-----------------------------+
Am I correct in assuming MAX(date)
is computed twice here?
If so, this would definitely reduce the efficiency of this query. Is it possible to refactor the query so that MAX(date)
is only computed once, so that performance can be maximised?
Upvotes: 0
Views: 37
Reputation: 522084
A peek into the query pipeline/execution plan will answer your question. During the GROUP BY
aggregation step, MySQL will compute the max date for each thread_id
. Then, during the HAVING
filter, the max date will already be available to use. So, I would expect MAX(date)
to be computed only once.
Note that MySQL actually permits using aliases in the HAVING
clause, so you could have written your query as:
SELECT thread_id, MAX(date) AS max_date
FROM yourTable
GROUP BY thread_id
HAVING max_date < 1555;
Upvotes: 1
Reputation: 5161
Absolutly NOT !
The letters SQL means Structured Query Language. The most important word into this name is QUERY that means it is not a procedural language. In a procedural language, you write the exact commands that you want the computer to do. In SQL, a "query" language, you do not write a program code, but only the desired answer, then the SQL algrebrizer/optimizer have to compute the program that will be executed by the query processor (known as "query execution plan").
SQL is translated into relational algebra which is a simple mathematic formula and then be simplified by the algrebrizer like the work you've done at school when the teacher gives you a complex equation to solve : factorization, substitution...
The SQL engine will do the same, by factorizing the MAX(date) that will be compute once only !
Upvotes: 1