Kkau
Kkau

Reputation: 41

MySQL queries stuck in "sending data" for 30 seconds after migrating to RDS

This query (along with a few others I think have a related issue) did not take 30 seconds when MySQL was local on the same EC2 instance as the rest of the website. More like milliseconds.

Does anything look off?

SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id = 
chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id = 
chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND 
chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1
LEFT JOIN chv_follows ON chv_follows.follow_followed_user_id = 
chv_images.image_user_id
LEFT JOIN chv_follows_projects ON 
chv_follows_projects.follows_project_project_id = 
chv_images.image_project_id LEFT JOIN chv_projects ON 
chv_projects.project_id = follows_project_project_id WHERE 
chv_follows.follow_user_id='1' OR (follows_project_user_id = 1 AND 
chv_projects.project_privacy = "public" AND 
chv_projects.project_is_public_upload = 1)  GROUP BY chv_images.image_id 
ORDER BY chv_images.image_id DESC
LIMIT 0,15

And this is what EXPLAIN shows:

Thank you

Update: This query has the same issue. It does not have a GROUP BY.

 SELECT *, chv_images.image_id FROM chv_images
 LEFT JOIN chv_storages ON chv_images.image_storage_id = 
 chv_storages.storage_id
 LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
 LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
 LEFT JOIN chv_categories ON chv_images.image_category_id = 
 chv_categories.category_id
 LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
 LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND 
 chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1

 ORDER BY chv_images.image_id DESC
 LIMIT 0,15

Upvotes: 2

Views: 1355

Answers (3)

Rick James
Rick James

Reputation: 142208

(Bill's ideas are good. I'll take the discussion a different way...)

Explode-Implode -- If the LEFT JOINs match no more than 1 row, change, for example,

SELECT 
    ...
    LEFT JOIN  chv_meta  ON chv_images.image_id = chv_meta.image_id

into

SELECT ...,
    ( SELECT foo FROM chv_meta WHERE image_id = chv_images.image_id ) AS foo, ...

If that can be done for all the JOINs, you can get rid of GROUP BY. This will avoid the costly "explode-implode" where JOINs lead to more rows, then GROUP BY gets rid of the dups. (I suspect you can't move all the joins in.)

OR -> UNION -- OR is hard to optimize. Your query looks like a good candidate for turning into UNION, then making more indexes that will become useful.

    WHERE  chv_follows.follow_user_id='1'
      OR  (follows_project_user_id = 1
              AND  chv_projects.project_privacy = "public"
              AND  chv_projects.project_is_public_upload = 1
          )

Assuming that follows_project_user_id is in `chv_images,

( SELECT ...
    WHERE chv_follows.follow_user_id='1' )
UNION DISTINCT   -- or ALL, if you are sure there won't be dups
( SELECT ...
    WHERE follows_project_user_id = 1
      AND  chv_projects.project_privacy = "public"
      AND  chv_projects.project_is_public_upload = 1 )

Indexes needed:

chv_follows:  (follow_user_id)
chv_projects: (project_privacy, project_is_public_upload) -- either order

But this has not yet handled the ORDER BY and LIMIT. The general pattern for such:

( SELECT ... ORDER BY ... LIMIT 15 )
UNION
( SELECT ... ORDER BY ... LIMIT 15 )
ORDER BY ... LIMIT 15

Yes, the ORDER BY and LIMIT are repeated.

That works for page 1. If you want the next 15 rows, see http://mysql.rjweb.org/doc.php/pagination#pagination_and_union

After building those two sub-selects, look at them; I think you will be able to optimize each one, and may need new indexes because the Optimizer will start with a different 'first' table.

Upvotes: 0

Bill Karwin
Bill Karwin

Reputation: 562230

That EXPLAIN shows several table-scans (type: ALL), so it's not surprising that it takes over 30 seconds.

Here's your EXPLAIN: enter image description here

Notice the column rows shows an estimated 14420 rows read from the first table chv_images. It's doing a table-scan of all the rows.

In general, when you do a series of JOINs, you can multiple together all the values in the rows column of the EXPLAIN, and the final result is how many row-reads MySQL has to do. In this case it's 14420 * 2 * 1 * 1 * 2 * 1 * 916, or 52,834,880 row-reads. That should put into perspective the high cost of doing several table-scans in the same query.

You might help avoid those table-scans by creating some indexes on these tables:

ALTER TABLE chv_storages
  ADD INDEX (storage_id);

ALTER TABLE chv_categories
  ADD INDEX (category_id);

ALTER TABLE chv_likes
  ADD INDEX (like_content_id, like_content_type, like_user_id);

Try creating those indexes and then run the EXPLAIN again.

The other tables are already doing lookups by primary key (type: eq_ref) or by secondary key (type: ref) so those are already optimized.

Your EXPLAIN shows your query uses a temporary table and filesort. You should reconsider whether you need the GROUP BY, because that's probably causing the extra work.

Another tip is to avoid using SELECT * because it might be forcing the query to read many extra columns that you don't need. Instead, explicitly name only the columns you need.

Upvotes: 1

Oleksandr Tymchenko
Oleksandr Tymchenko

Reputation: 1

Is there any indexes in chv_images?

I propose:

CREATE INDEX idx_image_id ON chv_images (image_id); 

Upvotes: 0

Related Questions