Reputation: 50600
My problem is with this query in MySQL:
select
SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
FROM parts
-- where parttypeid = 1
When the where
is uncommented, my run time jumps for 4.5 seconds to 341 seconds. There are approximately 21M total records in this table.
My EXPLAIN
looks like this, which seems to indicate that it is utilizing the INDEX I have on PartTypeId
.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE parts ref PartTypeId PartTypeId 1 const 11090057
I created my table using this query:
CREATE TABLE IF NOT EXISTS parts (
Id INTEGER NOT NULL PRIMARY KEY,
PartTypeId TINYINT NOT NULL,
OrderThreshold INTEGER NOT NULL,
PartName VARCHAR(500),
INDEX(Id),
INDEX(PartTypeId),
INDEX(OrderThreshold),
);
The query with out the WHERE
returns
LOW_COUNT HIGH_COUNT
3570 3584
With the where
the results look like this:
LOW_COUNT HIGH_COUNT
2791 2147
How can I improve the performance of my query to keep the run time down in the seconds (instead of minutes) range when adding a where
statement that only looks at one column?
Upvotes: 2
Views: 61
Reputation: 1270391
Your accepted answer doesn't explain what is going wrong with your original query:
select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts
where parttypeid = 1;
The index is being used to find the results, but there are a lot of rows with parttypeid = 1
. I am guessing that each data page probably has at least one such row. That means that all the rows are being fetched, but they are being read out-of-order. That is slower than just doing a full table scan (as in the first query). In other words, all the data pages are being read, but the index is adding additional overhead.
As Juergen points out, a better form of the query moves the conditions into the where
clause:
select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts
where parttypeid = 1 AND
(OrderThreshold < @LOW_COST OR OrderThreshold > @HIGH_COST)
(I prefer this form, because the where
conditions match the case
conditions.) For this query, you want an index on parts(parttypeid, OrderThreshold)
. I'm not sure about the MySQL optimizer in this case, but it might be better to write as:
select 'Low' as which, count(*) as CNT
from parts
where parttypeid = 1 AND
OrderThreshold < @LOW_COST
union all
select 'High', count(*) as CNT
from parts
where parttypeid = 1 AND
OrderThreshold > @HIGH_COST;
Each subquery should definitely use the index in this case. (If you want them in one row with two columns, there are a couple ways to achieve that, but I'm guessing that is not so important.)
Unfortunately, the best index for your query without the where
clause is parts(OrderThreshold)
. This is a different index from the above.
Upvotes: 2
Reputation: 204854
Try
select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts
where parttypeid = 1
and OrderThreshold not between @LOW_COST and @HIGH_COST
and
select count(*) as LOW_COUNT, null as HIGH_COUNT
from parts
where parttypeid = 1
and OrderThreshold < @LOW_COST
union all
select null, count(*)
from parts
where parttypeid = 1
and OrderThreshold > @HIGH_COST
Upvotes: 4