Reputation: 1780
Say I have a table foo(bar:int)
with a normal btree index on bar
, and the table contains 100 rows (with bar
having values 2 to 101). When running the following query, how does MySQL decide whether to do the >=
or the <=
first?
SELECT bar from foo where bar BETWEEN 0 AND 1
If it did the >=
then it would scan all 100 rows. On the other hand if it did the <=
it would do 0 scans. Is there a way to specify which to do first?
This is particularly relevant for me for datetime
range queries on very large tables containing years of historic data and where the timeframe requested is close to the current time. If it did the <=
first then there would be a huge scan on the many years worth of data. For example:
SELECT * from table
WHERE instant BETWEEN DATE_SUB(NOW(), INTERVAL 1 HOUR) AND DATE_SUB(NOW(), INTERVAL 1 MINUTE);
Upvotes: 1
Views: 340
Reputation: 12181
If I understand your question correctly: When a B-tree index is created usually it is a B+tree http://en.wikipedia.org/wiki/B%2B_tree
Wikipedia:"A simple B+ tree example linking the keys 1–7 to data values d1-d7. The linked list (red) allows rapid in-order traversal."
That means that the smallest element in the range is found (in your case the earliest date) in approximately log_b(N) time and then there are k hops from all the leafs of the B+ tree till all the elements in the range are exhausted.
k is the number of elements in the range that exist in the database and not all the possible values, N is the height of the tree (in Wikipedia example it is 2) and b is the branching factor of the tree (in Wikipedia example it is 3)
Edit: Cases:
In the case of only foo<=1 it goes in the B+tree and doesn't find anything so we have 0 scans.
In the case of only foo>=0 it will not find 0 but the first in it's values and it will go to 2 in your case. Then it will do the 100 scans
When you have 'foo 0 between 40' it is like foo<=40 AND foo>=0, so it will go to the first, in your case to 2 and then do (38hopes/scans assuming that all the values are in the database). In other words, they are not executed separately, so they will use the index together!
In general Sql servers have optimizers that can detect ranges and rewrite your queries by putting your ANDs in the right order. They also keep track of the query performance and they decide after estimating cost plans which execution plan to follow. If you have SQL Server you can see all these plans with any query.
Upvotes: 1