Reputation: 12847
I have a messages
table with 15 million rows.
Below query returns about 5 million(But limited in 15 rows) records in less than 1 second :
SELECT messages.* FROM messages
INNER JOIN gateways ON
messages.gateway_id=gateways.id
INNER JOIN orders ON
gateways.order_id=orders.id
WHERE orders.user_id=6500 AND messages.deleted=0
AND messages.type='Out' LIMIT 15;
But when I adding an Order By
idDESC
to the end of it, it becomes extremely slow about ~40 seconds :
SELECT messages.* FROM messages
INNER JOIN gateways ON
messages.gateway_id=gateways.id
INNER JOIN orders ON
gateways.order_id=orders.id
WHERE orders.user_id=6500 AND messages.deleted=0
AND messages.type='Out' ORDER BY messages.id DESC LIMIT 15;
Any help would be great appreciated.
Upvotes: 1
Views: 167
Reputation: 11581
I assume that
Therefore, this:
INNER JOIN gateways ON messages.gateway_id=gateways.id
INNER JOIN orders ON gateways.order_id=orders.id
WHERE orders.user_id=6500 AND messages.deleted=0
can be rephrased into English as:
"Get the gateways which belong to the orders which belong to this user".
Now, to get the most recent messages related to this user, the problem is we will likely have many different gateway_id's (about 143 according to your EXPLAIN), so we can't use an index to skip the sort.
Well, we could, as O. Jones demonstrated, but there is a catch. Here is the simplified version of the query:
SELECT ... FROM messages
WHERE gateway_id IN (1,2) ORDER BY id DESC LIMIT 10
If we have an index on (id,gateway_id) then MySQL will most likely decide to scan it in descending order. If it quickly finds 10 messages having "gateway_id IN (1,2)" then it will be fast. However, it could need to scan the entire index if these gateway_ids have very old messages, or no messages at all.
If the PK relationships are as I described, I would materialize a user_id column inside the messages table, which would then allow an index on (user_id,message_id) which would give query timings well under a millisecond.
Upvotes: 1
Reputation: 108651
The pattern SELECT lots_of_stuff ORDER BY something LIMIT small_integer
is notorious for causing performance trouble. Leaving out ORDER BY something
makes the performance trouble go away. Why? Because the pattern with ORDER BY
causes the MySQL server to sort a very large number of fairly large rows (5 million in your case), only to discard all but a small number of them. That uses a lot of RAM, CPU, and IO in your server, just to discard most of the work.
Your best bet is to use a deferred join type of pattern here, where you sort nothing but the message.id
values. Use this subquery to do that.
SELECT messages.id
FROM messages
INNER JOIN gateways ON messages.gateway_id=gateways.id
INNER JOIN orders ON gateways.order_id=orders.id
WHERE orders.user_id=6500
AND messages.deleted=0
AND messages.type='Out'
ORDER BY messages.id DESC
LIMIT 15
This will give you a nice little collection of 15 message.id
values.
Your next step is to work on optimizing this subquery. I suggest you try a compound covering index on your messages
table, containing the columns (deleted, type, id, gateway_id)
. This shoud help accelerate it.
You may need indexes on the other tables as well. You should consider using the EXPLAIN
function in MySQL to analyze your performance.
Finally, use that little collection of messages.id
values to get the messages
rows you need, like this. (This is the deferred join; you're deferring fetching the whole row until you know which rows you'll need. That way you don't have to ORDER
the whole mess.)
Edit Add a compound index on gateways (order_id, id)
to avoid the full table scan of that table. It's not enormous, but this might help a bit.
SELECT a.*
FROM messages a
JOIN (
SELECT messages.id
FROM messages
INNER JOIN gateways ON messages.gateway_id=gateways.id
INNER JOIN orders ON gateways.order_id=orders.id
WHERE orders.user_id=6500
AND messages.deleted=0
AND messages.type='Out'
ORDER BY messages.id DESC
LIMIT 15
) b ON a.id = b.id
ORDER BY a.id DESC
Upvotes: 1