Reputation: 3
I'm running the following two queries on a innodb table that contains ~3M rows. For some reason the first query takes less than a second to return 168,199 rows whereas the second query takes 8 seconds and returns 167,159 rows? The time it takes to return almost the same results increases by a factor of 10?
SELECT count(idActivities) as amt
FROM Activities
WHERE Data_Type='email'
AND Status='sent'
AND (Created > '2019-07-17 00:00:00'
AND Created <= '2019-08-17 00:00:00');
SELECT count(idActivities) as amt
FROM Activities
WHERE Data_Type='email'
AND Status='sent'
AND (Created > '2019-08-17 00:00:00'
AND Created <= '2019-09-17 00:00:00');
If I remove the additional where clauses from the second statement e.g.
SELECT count(idActivities) as amt
FROM Activities
WHERE (Created > '2019-08-17 00:00:00'
AND Created <= '2019-09-17 00:00:00');
The query time drops to half a second. If I add a single additional where clause to the statement e.g. Data_Type= or Status= it jumps back to 8 plus seconds.
I've also tried moving the dates a few days in either direction but this doesn't affect the query time.
The table is indexed on idActivities, Data_Type, Status and Created.
The server is running 5GB Ram, 8 cores and has innodb_buffer_pool_size=3G, InnoDB buffer usage is at 49%.
I've tried the same query on a different server and the results are roughly the same time around ~4 seconds which is still pretty slow.
The only difference I've noticed is that the cardinality of the Data_Type column is different even though the table is virtually the same (it's a back up from the day before).
I would appreciate any help or assistance in understanding how to improve the query time? Running "DISTINCT(Data_Type)" only returns 13 rows for the entire table.
EDITED Thank you Salman A, by adding the following composite index massively improved the performance of both queries:
CREATE INDEX ix_1 ON Activities (Created, Data_Type, Status);
Upvotes: 0
Views: 293
Reputation: 272096
I would suggest creating the following covering index:
CREATE INDEX ix_1 ON t (Data_Type, Status, Created)
The order of columns matter. High cardinality columns are usually placed first, but for this particular query you need to put the created column at the end since it involves range comparison (the first two require equality comparison).
Upvotes: 3
Reputation: 28834
You need to define a Composite Index here with a specific order of columns in it. The general rule of thumb is:
First priority should be given to all the columns, which are inside the WHERE
clause, and are connected by AND
clause, and are compared to a constant value using either =
, IS NULL
, or <=>
. So, in your query, there are two columns following this: Data_Type
and Status
.
Second priority can be given to following three scenarios:
GROUP BY
clause (if existing).ORDER BY
clause (if existing)In this case, Created
is a Range condition, so we will add that column to the index at the end, because on encountering range condition, MySQL stops at the column, and does not access further columns in the index.
So, you basically need to define the following index:
ALTER TABLE Activities ADD INDEX(Data_Type, Status, Created);
Quoting from Rick James' notes:
When you have a composite index that could be in any order, the cardinality of the individual columns does not matter in picking the order. The cardinality of the entire index is what matters.
Upvotes: 2
Reputation: 1061
When you have unexpected behavior like this I always check what the server is doing by getting it to explain what it's doing.
You can do this using the EXPLAIN keyword at the start of the query. I would guess that the slower query is scanning the entire table for your result set
https://dev.mysql.com/doc/refman/8.0/en/using-explain.html
an aside note: unless your data_type and status columns are very selective (at a guess they would not be). I would take a guess that they are pointless indexes. Typically indexing status columns is only a good idea when you have a large proportion of one status type. i.e 95% closed 'status' and 5% 'open' and you have a query where you want to find all open statuses.
Upvotes: 0