Martin AJ

Reputation: 6697

How should I create an index when there is an OR between conditions?

I have a query like this:

DELETE FROM events WHERE type = 4 AND author_id IN (?, ?) AND post_id IN (?, ?)

As you know, author_id IN (?, ?) translates to author_id = ? OR author_id = ?. So should I create an index on the author_id column?

In fact, I'm trying to understand how MySQL treats conditions when there is an OR between them. I read this sentence somewhere:

MySQL seems to ignore indices once it encounters an OR.

But I cannot understand what it means. Can anybody explain it? Overall, what indexes do I need for the query above?

Upvotes: 1

Views: 412

Answers (1)

Barmar

Reputation: 782105

It's true that OR usually prevents effective use of indexes. A query can generally only use one index per table when processing the WHERE clause. So if you have a condition like

WHERE col1 = 1 OR col2 = 2

it can't use any single index to find all the matching rows. You're generally best off splitting the query into two and using UNION:

SELECT ...
FROM TABLE
WHERE col1 = 1
UNION
SELECT ...
FROM TABLE
WHERE col2 = 2

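However, in simple cases MySQL can do something equivalent on its own: its Index Merge optimization can read two indexes and take the union of the results. Check the EXPLAIN output for your query; an access type of index_merge indicates that this is happening.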
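
The equivalence behind the UNION rewrite can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module instead of MySQL, purely so that it runs without a server. The table t, its indexes, and the sample rows are made up for illustration, and SQLite's planner is not MySQL's, so this only shows that both forms return the same rows, not which plan MySQL would pick.

```python
import sqlite3

# In-memory database with a hypothetical table and one index per column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER);
    CREATE INDEX idx_col1 ON t (col1);
    CREATE INDEX idx_col2 ON t (col2);
    INSERT INTO t (id, col1, col2) VALUES
        (1, 1, 9), (2, 5, 2), (3, 1, 2), (4, 7, 7);
""")

# Original form: one query with OR across two different columns.
or_rows = conn.execute(
    "SELECT id FROM t WHERE col1 = 1 OR col2 = 2 ORDER BY id"
).fetchall()

# Rewritten form: each branch can use its own index; UNION removes
# the duplicate produced by row 3, which matches both branches.
union_rows = conn.execute("""
    SELECT id FROM t WHERE col1 = 1
    UNION
    SELECT id FROM t WHERE col2 = 2
    ORDER BY id
""").fetchall()

print(or_rows)
print(union_rows)
```

Note that this relies on UNION rather than UNION ALL; with UNION ALL, a row matching both conditions would be returned twice.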

But when the OR condition involves the same column, it can still use the index, e.g. if you have

WHERE col1 = 1 OR col1 = 2

it just has to look up these two index entries in the index for col1. And this is what it does when you write:

WHERE col1 IN (1, 2)

However, your query is more complex than this, because it's also combining multiple conditions with AND. If you were just using = comparisons, it could optimize this with a multi-column index, e.g.

INDEX (type, author_id, post_id)

But when you have multiple IN conditions, this is harder to do, because it would need to enumerate all the combinations.

type = 4 AND author_id IN (1, 2) AND post_id IN (10, 20)

would require looking up all these index entries:

4, 1, 10
4, 1, 20
4, 2, 10
4, 2, 20

I don't think MySQL will generate all the combinations like this (but I could be wrong -- check the EXPLAIN output to see if it can).
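
The enumeration described above is just the Cartesian product of the value lists. As a rough sketch in Python (the variable names are made up, and this only illustrates the counting, not anything MySQL does internally):

```python
from itertools import product

# Values taken from the example condition:
# type = 4 AND author_id IN (1, 2) AND post_id IN (10, 20)
type_values = [4]
author_ids = [1, 2]
post_ids = [10, 20]

# One (type, author_id, post_id) key per combination, in the column
# order of a hypothetical INDEX (type, author_id, post_id).
lookups = list(product(type_values, author_ids, post_ids))

for key in lookups:
    print(key)
```

The number of keys is the product of the list lengths (here 1 * 2 * 2 = 4), so it grows quickly as the IN lists get longer.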

Instead, it will select whichever index it thinks will be most effective, use that to find the rows that match that part of the condition, and scan those rows to evaluate the remaining conditions.
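
That "use one index, then filter the rest" strategy can be sketched with a toy in-memory model. Everything here is hypothetical: the sample rows, the dictionary standing in for an index on author_id, and the find_matches helper. A real storage engine works very differently; this only shows the shape of the idea.

```python
from collections import defaultdict

# Made-up sample of the events table.
rows = [
    {"id": 1, "type": 4, "author_id": 1, "post_id": 10},
    {"id": 2, "type": 4, "author_id": 1, "post_id": 30},
    {"id": 3, "type": 5, "author_id": 2, "post_id": 20},
    {"id": 4, "type": 4, "author_id": 2, "post_id": 20},
    {"id": 5, "type": 4, "author_id": 3, "post_id": 10},
]

# Toy single-column "index": author_id -> list of rows with that value.
author_index = defaultdict(list)
for row in rows:
    author_index[row["author_id"]].append(row)


def find_matches(author_ids, post_ids, type_value):
    """Look up candidates via the author_id index, then filter the rest."""
    matches = []
    for author_id in author_ids:
        # Index lookup: only rows for this author_id are visited.
        for row in author_index.get(author_id, []):
            # Remaining conditions are checked row by row.
            if row["type"] == type_value and row["post_id"] in post_ids:
                matches.append(row["id"])
    return matches


matched_ids = find_matches([1, 2], {10, 20}, 4)
print(matched_ids)
```

Here only the four rows with author_id 1 or 2 are examined; the row for author_id 3 is never touched, and the type and post_id checks discard two of the candidates.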

Upvotes: 3
