Reputation: 508
I have a query that is being ran on PGSQL, and when queried at a fast rate for large data sets, it is taking a long time to run because it isn't making use of the available indexes. I found that changing the filter from multiple OR's to an IN clause causes the right index to be used. Is there a way I can force the index to be used even when using OR's?
Query with Disjunction:
SELECT field1, field2,..., fieldN
FROM table1 WHERE
((((field9='val1' OR field9='val2') OR field9='val3') OR field9='val4')
AND (field6='val5'));
Query Plan:
"Bitmap Heap Scan on table1 (cost=18.85..19.88 rows=1 width=395) (actual time=0.017..0.017 rows=0 loops=1)"
" Recheck Cond: (((field6)::text = 'val5'::text) AND (((field9)::text = 'val1'::text) OR ((field9)::text = 'val2'::text) OR ((field9)::text = 'val3'::text) OR ((field9)::text = 'val4'::text)))"
" -> BitmapAnd (cost=18.85..18.85 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)"
" -> Bitmap Index Scan on idx_field6_field9 (cost=0.00..9.01 rows=611 width=0) (actual time=0.015..0.015 rows=0 loops=1)"
" Index Cond: ((field6)::text = 'val5'::text)"
" -> BitmapOr (cost=9.59..9.59 rows=516 width=0) (never executed)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val1'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val2'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val3'::text)"
" -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40 rows=129 width=0) (never executed)"
" Index Cond: ((field9)::text = 'val4'::text)"
"Planning time: 0.177 ms"
"Execution time: 0.061 ms"
Query with IN
SELECT field1, field2,..., fieldN
FROM table1
WHERE
((field9 IN ('val1', 'val2', 'val3', 'val4'))
AND (field6='val5'));
Query Plan:
"Index Scan using idx_field6_field9 on table1 (cost=0.43..6.77 rows=1 width=395) (actual time=0.032..0.032 rows=0 loops=1)"
" Index Cond: (((field6)::text = 'val5'::text) AND ((field9)::text = ANY ('{val1,val2,val3,val4}'::text[])))"
"Planning time: 0.145 ms"
"Execution time: 0.055 ms"
There is an index on field 6 and field 9 which the second query uses as expected, which the first one also should. Field9 is also kind of like a state field, so its cardinality is extremely low - there's only like 9 different values across the whole table. Unfortunately, it isn't straightforward to change the query to use an IN clause, so getting PG to use the right plan would be ideal.
Upvotes: 1
Views: 227
Reputation: 246103
There is no way you can get the fast plan (single index scan) using the OR
condition. You'll have to rewrite the query.
You want to know why, which is always difficult to explain. With optimizations like that, there are usually two reasons:
Nobody got around to do it.
This requires extra effort every time a query with an OR
is planned:
Are there several conditions linked with OR
that have the same expression on one side?
Both plans, the original and the rewritten one, would have to be estimated. It may well be that the BitmapOr is the most efficient way to process the query.
This price would have to be paid by every query with OR
in it.
I am not saying that it is a bad idea to add an optimization like this, but there are two sides to the coin.
Upvotes: 1