Reputation: 1168
I have a complex MySQL query which contains multiple calculations and joins to retrieve a list of contractors by location. The contractors table (users) contains 100,000+ rows and growing. The issue I'm having is that the query takes over 1.5 seconds to execute which is causing a significant delay to the page load.
I have found that removing the ORDER BY clause, the speed is increased significantly (< 0.05s). After looking through other related questions on Stack Overflow, I understand why this is the case but have not yet found a viable solution.
It's also worth noting I have added indexes as suggested in other posts but I believe you cannot optimize any further when sorting on calculated columns. (Please correct me if I'm wrong)
Here is the query (I have removed several columns and joins for simplicity but still the query takes the same time to execute):
SELECT `users`.`id`,
`users`.`username`,
IF (Max(up.premium_expires_at) > Now(), 1, 0) AS `is_premium`,
IF (users.last_online_at >= Now() - INTERVAL 30 day, 1, 0) AS `recent_login`,
IF (da.id IS NOT NULL, 1, 0) AS `is_available`,
( 3959 * Acos(Cos(Radians(53.80592)) * Cos(Radians(lat)) * Cos(
Radians(lng) - Radians(-1.53834
)) + Sin(Radians(53.80592)) *
Sin(Radians(lat))) ) AS
`distance`
FROM `users`
INNER JOIN `users_places` AS `up`
ON `users`.`id` = `up`.`user_id`
INNER JOIN `places` AS `mp`
ON `users`.`place_id` = `mp`.`id`
LEFT JOIN `users_dates_available` AS `da`
ON `da`.`user_id` = `users`.`id`
AND `from` <= Curdate()
AND `to` >= Curdate()
LEFT JOIN (SELECT user_id,
Sum(score) AS score
FROM users_feedback
WHERE status = 1
GROUP BY user_id) AS feedback
ON `users`.`id` = `feedback`.`user_id`
WHERE `users`.`status` = 1
AND `users`.`approved` = 1
GROUP BY `users`.`id`
HAVING `distance` < 50
ORDER BY `is_premium` DESC,
`recent_login` DESC
LIMIT 5
And here's the results of EXPLAIN
So I guess my question is: What is the quickest way to display this data on a web page?
What I've tried:
The query is part of a Laravel application. I've tried running the query without the ORDER BY and sorting by PHP. However execution times remain slow.
Running the query without LEFT joins and I noticed significant improvements to speed. However the query must use LEFT joins for the calculations in the SELECT criteria (we're checking for NULL values).
Using Views - still same query speeds with a pre-compiled view.
The only other option I can think of is to create a temporary table which contains all the calculated fields and query this. However this will not store the 'distance' column as this is specific to the user running the query and I will still be sorting by a calculated column.
Is there another option or another way to optimize this query that I'm missing? Thanks
Upvotes: 0
Views: 133
Reputation: 142278
The query seems not to use feedback
, so remove the LEFT JOIN
. That will save on some wasted effort.
Similarly, places
seems to be useless, except as an existence test.
Which table are lat
and lng
in? (I cannot finish my analysis without knowing what table each column is in.)
Are from
and to
of datatype DATE
? If so, the WHERE
clause involving them seems to say "anytime today". Is that correct?
Get some of that cleaned up. After that, I may be able to suggest moving one of the Joins until after the GROUP BY
and LIMIT
. Or maybe the GROUP BY
can be eliminated.
Some indexes that might be useful:
users: INDEX(status, approved, id, username, last_online_at, place_id)
up: INDEX(user_id, premium_expires_at)
da: INDEX(user_id, id)
users_feedback: INDEX(status, user_id)
Distance
The problem with distance queries is that the simple SELECT
requires checking every row in the table. This is slow. I have a blog on what to do to improve performance for the general "find nearest" problem. It discusses 5 approaches, starting with the least efficient (what your code does): http://mysql.rjweb.org/doc.php/find_nearest_in_mysql
Upvotes: 2