Reputation: 3846
1 - PRIMARY
used in a secondary index, e.g. secondary index on (PRIMARY
,column1
)
2 - I'm aware mysql cannot continue using the rest of an index as soon as one part was used for a range scan, however: IN (...,...,...)
is not considered a range, is it? Yes, it is a range, but I've read on mysqlperformanceblog.com that IN
behaves differently than BETWEEN
according to the use of index.
Could anyone confirm those two points? Or tell me why this is not possible? Or how it could be possible?
UPDATE:
Links:
http://www.mysqlperformanceblog.com/2006/08/10/using-union-to-implement-loose-index-scan-to-mysql/
http://www.mysqlperformanceblog.com/2006/08/14/mysql-followup-on-union-for-query-optimization-query-profiling/comment-page-1/#comment-952521
UPDATE 2: example of nested SELECT:
SELECT * FROM user_d1 uo
WHERE EXISTS (
SELECT 1 FROM `user_d1` ui
WHERE ui.birthdate BETWEEN '1990-05-04' AND '1991-05-04'
AND ui.id=uo.id
)
ORDER BY uo.timestamp_lastonline DESC
LIMIT 20
So, the outer SELECT
uses timestamp_lastonline
for sorting, the inner either PK
to connect with the outer or birthdate
for filtering.
What other options rather than this query are there if MySQL cannot use index on a range scan and for sorting?
Upvotes: 7
Views: 748
Reputation: 8406
The column(s) of the primary key can certainly be used in a secondary index, but it's not often worthwhile. The primary key guarantees uniqueness, so any columns listed after it cannot be used for range lookups. The only time it will help is when a query can use the index alone
As for your nested select, the extra complication should not beat the simplest query:
SELECT * FROM user_d1 uo
WHERE uo.birthdate BETWEEN '1990-05-04' AND '1991-05-04'
ORDER BY uo.timestamp_lastonline DESC
LIMIT 20
MySQL will choose between a birthdate
index or a timestamp_lastonline
index based on which it feels will have the best chance of scanning fewer rows. In either case, the column should be the first one in the index. The birthdate
index will also carry a sorting penalty, but might be worthwhile if a large number of recent users will have birth dates outside of that range.
If you wish to control the order, or potentially improve performance, a (timestamp_lastonline, birthdate)
or (birthdate, timestamp_lastonline)
index might help. If it doesn't, and you really need to select based on the birthdate first, then you should select from the inner query instead of filtering on it:
SELECT * FROM (
SELECT * FROM user_d1 ui
WHERE ui.birthdate BETWEEN '1990-05-04' AND '1991-05-04'
) as uo
ORDER BY uo.timestamp_lastonline DESC
LIMIT 20
Even then, MySQL's optimizer might choose to rewrite your query if it finds a timestamp_lastonline
index but no birthdate
index.
And yes, IN (..., ..., ...)
behaves differently than BETWEEN
. Only the latter can effectively use a range scan over an index; the former would look up each item individually.
Upvotes: 2
Reputation: 1420
yes @Andrius_Naruševičius is right the IN statement is merely shorthand for EQUALS OR EQUALS OR EQUALS has no inherent order whatsoever where as BETWEEN is a comparison operator with an implicit greater than or less than and therefore absolutely loves indexes
I honestly have no idea what you are talking about, but it does seem you are asking a good question I just have no notion what it is :-). Are you saying that a primary key cannot contain a second index? because it absolutely can. The primary key never needs to be indexed because it is ALWAYS indexed automatically, so if you are getting an error/warn (I assume you are?) about supplementary indices then it's not the second, third index causing it it's the PRIMARY KEY not needing it, and you mentioning that probably is the error. Having said that I have no idea what question you asked - it's my answer to my best guess as to your actual question.
Upvotes: 0
Reputation: 8578
2.IN
will obviously differ from BETWEEN
. If you have an index on that column, BETWEEN
will need to get the starting point and it's all done. If you have IN
, it will look for a matching value in the index value by value thus it will look for the values as many times as there are values compared to BETWEEN
's one time look.
Upvotes: 0