Reputation: 38400
I have a table that looks like this:
ID StartRange EndRange
----------------------------
1 1 3
2 4 8
3 9 12
And so on and so forth, so that there are over 5 million records. The last record looks something like this:
ID StartRange EndRange
---------------------------------
5235976 9894727374 9894727378
In other words, the StartRange
and EndRange
will never overlap for each record.
I need to do a query that finds the corresponding ID of a number that matches the range:
SELECT ID FROM BigTable WHERE '5000000' BETWEEN StartRange AND EndRange;
Unfortunately, this query takes several seconds to complete. I need to optimize it so that it takes the least amount of execution time. I did a little bit of research it looks like adding an index is not helpful because it would only apply if the number is exactly the StartRange
or EndRange
value, but not if it's between.
Does anyone have any tips or tricks I can use to bring down the execution time? Ideally I'd want it to be under 1 second if possible.
Upvotes: 4
Views: 2746
Reputation: 11393
Add a compound index to your table. This index must be made of the StartRange
and EndRange
fields:
ALTER TABLE `BigTable` ADD INDEX ( `StartRange` , `EndRange` );
Then use EXPLAIN
on your query to check that the new index is used:
EXPLAIN SELECT ID FROM BigTable WHERE '5000000' BETWEEN StartRange AND EndRange;
The output shows that MySQL is unable to use the new index with this query. You may then rewrite your initial query:
SELECT ID FROM BigTable WHERE StartRange>='5000000' AND EndRange<='5000000'
OR EndRange>='5000000' AND StartRange<='5000000'
This new query will return the same results as your initial query. The good news are with EXPLAIN
:
EXPLAIN SELECT ID FROM BigTable WHERE StartRange>='5000000' AND EndRange<='5000000'
OR EndRange>='5000000' AND StartRange<='5000000'
The output now shows that MySQL is able to use the new index.
Upvotes: 3
Reputation: 50970
An index will not speed up this query. Indexes can be used for BETWEEN searches but only whey they're "right way around" (eg StartRange BETWEEN 10000 AND 20000
).
To speed up this query you're going to have to resort to some trickery.
First off, if the range table is static or does not grow rapidly, and if the range values are really integers, you could generate an extra table containing all values from the lowest StartRange to the highest EndRange along with the matching id. Then you could search for the exact value you need.
Alternatively, calculate the largest value of EndRange - StartRange and call it MaxRange. Create an index on StartRange and change your query to:
SELECT ID FROM BigTable
WHERE StartRange BETWEEN ('5000000' - MaxRange) AND '5000000'
AND '5000000' BETWEEN StartRange AND EndRange;
Now, the first BETWEEN clause is indexable and should return a small number of rows. The second BETWEEN clause will then be applied only to that small subset of rows. Obviously, this relies on your being able to calculate a safe value of MaxRange in advance. Hopefully there's some actual maximum possible value for the range that will tell you this number.
Upvotes: 1
Reputation: 4624
I had a similar problem with a table of ip address ranges and the below really did the trick for me. You'll want an index on at least StartRange.
SELECT ID
FROM BigTable
INNER JOIN
(SELECT MAX(StartRange) AS start
FROM BigTable
WHERE StartRange <= @Target) AS s
ON StartRange = s.start
WHERE EndRange >= @Target;
Upvotes: 6
Reputation: 23265
An index should handle this query just fine, even if the value does not match the StartRange
and EndRange
.
Upvotes: 2