Reputation: 1605
I am new to MySQL, and I need to add indexes on an existing table (which contains roughly 200K rows).
Table mytable: (id:integer, created_time:timestamp, deleted_time:timestamp)
I have 2 queries which need to benefit from the index:
select s.id from mytable s
where s.completed_time is not null
and s.completed_time < ?
and ( s.deleted_time is null
or s.deleted_time >= ? );
and :
select s.id from mytable s
where
s.completed_time is not null
and (
( s.deleted_time is not null
and s.deleted_time >= ?
and s.deleted_time < ? )
or ( s.completed_time >= ?
and s.completed_time < ? ) ) ;
I am considering introducing a multi column index (on completed_time and deleted_time) However, I am not sure if the condition "s.completed_time is not null" matches the criteria to make these queries use the composite index.
Do you have any thoughts about what is best (composite index or 2 indexes)? I am trying to use "explain" to figure out what's best but I am unsure on how to interpret the results.
And more generally: with a table having a composite index on (column1, column2), I understand that filtering on column2 only will not use the index. But what if I introduce a dummy condition like (column1 > MIN_VALUE), or (column1 is not null) when it is correct to do so?
Thanks!
Upvotes: 1
Views: 398
Reputation: 29759
Assuming the two queries you mentionned will be "frequently" used, I would advise a composite index versus two distinct indexes on two columns.
As you already know, a query searching on two columns might sometimes use two separate indexes by (roughly) merging these two indexes into one. But this is sub-optimal, and has a cost in terms of performance.
Conversely, a composite index can only be used if the left-most columns are involved in the seach condition, or as the manual puts it:
MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on
With regards to your suggested hack (introducing dummy conditions so as to be able to use the index), this might work, but I would rather advise creating a second index on column2
only (besides the two-column index on (column1, column2)
). This comes at a (minor) cost, but is so much more elegant and reusable.
As for the suggestion of getting rid of NULL
values, I strongly disagree. It is sematnically incorrect to use 0
. 0
means "zero", NULL
means "no value". All your tests would need to account for this special value, whereas IS NULL
is standard and well understood everywhere. It is also just impractical in some situations (try to insert 0
with SQL_MODE='TRADITIONAL'
).
On the other hand, the performance gain is dubious (I believe this is mostly based on the false assumption that NULL
values are not indexed). It is easy to verify that a query like s.completed_time IS NOT NULL
will hit an index if such an index exists.
Upvotes: 1
Reputation: 742
Provided you get rid of the nulls as mentioned by a user. A composite index might work for the first query.
But the second query the index may not work. - As it has an OR between your indexed columns.
Usually it goes by the Left-Most columns in the index and in that order.
I suggest creating separate indexes. Though it has an overhead (May be) of http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
Upvotes: 1