Ludacia
Ludacia

Reputation: 69

WHERE clauses with multiple values too slow even after index created

I have table checked_result with columns idx,A, B and C with 200.000+ records.

I created indices for idx and each column and combination of those columns, 16 indices in total : idx,A,B,C,AB,BC,AC,ABC,(idx,A), (idx,B), (idx,C), (idx,A,B), (idx,A,C), (idx,B,C), (idx,A,B,C).

All indices should be used according to how query filters. When I use queries for single column :

SELECT * FROM checked_result WHERE (A in ('123')) AND idx >= 0 ORDER BY B DESC LIMIT 10
SELECT * FROM checked_result WHERE (A in ('123')) AND idx >= 0 ORDER BY B DESC, C ASC LIMIT 10

it's fast, but when I filter multiple values in one column :

SELECT * FROM checked_result WHERE (A in ('123','456')) AND idx >= 0 ORDER BY B DESC
SELECT * FROM checked_result WHERE (A in ('123','456')) AND idx >= 0 ORDER BY B DESC, C ASC LIMIT 10

it's really slow. Why does this happen even after indexing? How can I make it faster?

EXPLAIN QUERY PLAN SELECT * FROM checked_result WHERE (A in ('123')) AND idx >= 0
EXPLAIN QUERY PLAN SELECT * FROM checked_result WHERE (A in ('123','456')) AND idx >= 0

Each shows the following result:

SEARCH TABLE checked_result USING INDEX idx_checked_result_A_B (A=?)
SCAN TABLE checked_result USING INDEX idx_checked_result_A_B_C

Upvotes: 0

Views: 121

Answers (1)

Andrea B.
Andrea B.

Reputation: 754

You should build only the indexes that are really needed by your queries. And with many indexes potentially usable, it's vital to run ANALYZE to let sqlite pick the right one.

As pointed in the comments, a lot of your indexes are redundant. If you have index ABC, then index A and index AB are not necessary. Also, if you have inequalities in your WHERE clauses, order of columns in the index is significant. In your query an index (A,idx) could be used and would be useful, but your index (idx,A) can not be used. Since you don't have a (A,idx) index, the only one that can be used to reduce the number of records to be read is index (A). (EDIT: Your query planner actually chooses index AB instead of A, because of ORDER BY B, but it doesn't change much).

Index ABC is then used to return the records in your chosen order. If you only had ABC index, it would probably be better, because sqlite would use the same index for the WHERE and for the ORDER BY clauses.

Even using indexes, the speed of the query depends on how much the chosen index is selective. If your records have thousands of different A values, then the index will be very selective and only a small portion of the table will be read. But if you have only a dozen different values, then the query will have to read a big portion of the table and every additional value that you put in your IN clause will worsen the situation.

More info on the sqlite query optimizer here: https://www.sqlite.org/optoverview.html

Upvotes: 2

Related Questions