Reputation: 135
I have a MyISAM table with almost 1 billion records, with say, three fields: a, b and c.
The table has a btree multi-field index on columns a, b and c in that order. Analyzing the index shows that the cardinalities for the fields in that index are:
a: 112 (int)
b: 2694 (int)
c: 936426795 (datetime)
Which means that there are around 100 different values for a, around 20 different values for b, and for each combination of a and b, a whole lot of values of c.
I want to perform a query over a specific value of a, and a range over c. Something like
select a, b, c from mytable where a=4 and c >= "2011-01-01 00:00:00" and c < "2011-01-02 00:00:00"
Getting the query explained shows me that it will indeed use the index, but I don't know if it will use only the first field of the index and then scan over the rest of the table, or if it will be smart enough to apply the third field index, for each value of b, which would be the same as executing 20 different queries, one for each different value of b.
Anybody who knows the internal working of mysql indices can answer this question?
Edit: I'm not asking whether or not I can have mysql to use the index over only a and c. I know how btrees work, and I know that you can only use it over a, a and b, or a and b and c. I would like to know if the mysql optimizer is smart enough to apply the index over all the values in b so it can use the a+b+c index, considering that the cardinality of b is extremely small.
Consider an even simpler example. A table with two columns: a and b, and the index has cardinality 1 over a and 10000000 over b. Mysql should be smart enough to know that there's only one value of a, therefore this index is equivalent to an index only over b, and should use this index when performing queries only over b.
Upvotes: 1
Views: 2427
Reputation: 6854
Below are some facts related with B-TREE index usage by mysql and one example to understand this logic.
a) If any table has approx. 75% same data then index will not be used instead mysql will do table scan.
b) Normally mysql use only single index per table.
c) Index ordering methodology: Mysql will use index as per their order.
For example there is an combined index on a, b and c field idx_a_b_c(a,b,c)
i. select a, b, c from mytable where a=4
This query will use index as 'a' column is first in index order.
ii. select a, b, c from mytable where a=4 and b=5
This query will use combined index on a & b as these column are continue in index order.
iii. select a, b, c from mytable where a=4 and b=5 and c >= "2011-01-01 00:00:00"
This query will use combined index on a, b & c as these column are continue in index order.
iv. select a, b, c from mytable where c >= "2011-01-01 00:00:00"
This query will not use index as mysql consider index from left most corner and column c is not a left most column in index.
v. select a, b, c from mytable where a=4 and c >= "2011-01-01 00:00:00" and c < "2011-01-02 00:00:00"
This query will use only index on 'a' column but not of 'c' column as continuity is breaking here from left side. So this query will use index on a column and then scan table for column c for corresponding rows as per filter on column a.
Upvotes: 0
Reputation: 31
The question makes sense from the point of view that some database engines are smart enough to scan the index rather than scanning the table. (And they allow "data" to be stored in the index for this exact reason.) Scanning the index will be faster than joining the index to the base data, then limiting (excluding) returned rows based on the where clause.
It would make sense that only the rows in the index that meet the where condition (on columns in the index) are joined. Particularly if you are running a large key cache...
It would appear this doesn't happen in MySQL which is disappointing. Therefore no.
Upvotes: 0
Reputation: 52372
MySQL Reference Manual :: How MySQL Uses Indexes
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3). MySQL cannot use an index if the columns do not form a leftmost prefix of the index.
a,c is not a leftmost prefix of the index a,b,c so the index cannot be used to resolve the search on c.
Upvotes: 4