Reputation: 9
So I'm having a problem when I add in ORDER BY date_last_access DESC
the whole query slows down to 3secs and without it's about 0.2secs, why is it running so slow and how can I change the query to run faster?
There are also indexes on all the tables and fields used.
Users: 1+ million records Likes: 5+ million records (over 1 billion in production)
Tables will be growing really fast once in production.
QUERY
SELECT
id,
sid,
first_name,
date_birth,
location,
date_created,
date_last_access,
(3956 * 2 * ASIN(
SQRT(
POWER(
SIN(
({LAT} - latitude) * pi() / 180 / 2
),
2
) + COS({LAT} * pi() / 180) * COS(latitude * pi() / 180) * POWER(
SIN(
({LON} - longitude) * pi() / 180 / 2
),
2
)
)
)) AS distance
FROM
users
WHERE
`id` != {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
(SELECT COUNT(*) FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID}) = 0
HAVING distance <= {DIST}
ORDER BY date_last_access DESC
LIMIT {ROWS}
EXPLAIN
1 PRIMARY users ref PRIMARY,Index_2,discovery,index_1 index_1 2 const 226184 Using index condition; Using where; Using filesort
2 DEPENDENT SUBQUERY likes eq_ref PRIMARY,index_1,index_2 PRIMARY 16 const,hello.users.id 1 Using index
INDEXES
LIKES - user_id
, judged_user
- NORMAL - BTREE
USERS - id
, gender
, date_birth
, status
, date_last_access
- NORMAL - BTREE
When I order by id
instead of date_last_access
it seems to run much faster, could it be cause date_last_access is a datetime
format?
Upvotes: 0
Views: 79
Reputation: 1269973
You should phrase the FROM
clause as:
WHERE `id` <> {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
NOT EXISTS (SELECT 1 FROM likes l WHERE l.judged_user = users.id AND l.user_id = {UID})
HAVING distance <= {DIST}
For this query, you can try two indexes:
LIKES(judged_user, user_id)
USERS(Gender, status, date_birth, id)
Upvotes: 0
Reputation: 4350
Simplyfying the subquery could be a better way to avoid extra processing time (COUNT):
(SELECT COUNT(*) FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID}) = 0
could change to
(SELECT 1 FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID} limit 1) IS NULL
Avoiding a subquery could be the best way to improve the performance of the query. You could check what options could be better for your case (an index for likes.user_id
is required in this case)
FROM
users
LEFT JOIN (
SELECT distinct judged_user FROM likes WHERE likes.user_id = {UID}
) l ON l.judged_user=users.id
WHERE
`id` != {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
l.judged_user is NULL
Upvotes: 1
Reputation: 11
First try run a EXPLAIN of your query. This will show you what fields and operations are slowing your query. Then try to make joins with indexed columns and filter you resultset with more specific values.
Upvotes: 1