gta0004
gta0004

Reputation: 508

Postgres using inefficient query plan when using OR filter, uses correct indices when using IN clause

I have a query that is being ran on PGSQL, and when queried at a fast rate for large data sets, it is taking a long time to run because it isn't making use of the available indexes. I found that changing the filter from multiple OR's to an IN clause causes the right index to be used. Is there a way I can force the index to be used even when using OR's?

Query with Disjunction:

SELECT field1, field2,..., fieldN
FROM table1 WHERE
((((field9='val1' OR field9='val2') OR field9='val3') OR field9='val4')
AND (field6='val5'));

Query Plan:

"Bitmap Heap Scan on table1  (cost=18.85..19.88 rows=1 width=395) (actual time=0.017..0.017 rows=0 loops=1)"
"  Recheck Cond: (((field6)::text = 'val5'::text) AND (((field9)::text = 'val1'::text) OR ((field9)::text = 'val2'::text) OR ((field9)::text = 'val3'::text) OR ((field9)::text = 'val4'::text)))"
"  ->  BitmapAnd  (cost=18.85..18.85 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)"
"        ->  Bitmap Index Scan on idx_field6_field9  (cost=0.00..9.01 rows=611 width=0) (actual time=0.015..0.015 rows=0 loops=1)"
"              Index Cond: ((field6)::text = 'val5'::text)"
"        ->  BitmapOr  (cost=9.59..9.59 rows=516 width=0) (never executed)"
"              ->  Bitmap Index Scan on idx_id_field9  (cost=0.00..2.40 rows=129 width=0) (never executed)"
"                    Index Cond: ((field9)::text = 'val1'::text)"
"              ->  Bitmap Index Scan on idx_id_field9  (cost=0.00..2.40 rows=129 width=0) (never executed)"
"                    Index Cond: ((field9)::text = 'val2'::text)"
"              ->  Bitmap Index Scan on idx_id_field9  (cost=0.00..2.40 rows=129 width=0) (never executed)"
"                    Index Cond: ((field9)::text = 'val3'::text)"
"              ->  Bitmap Index Scan on idx_id_field9  (cost=0.00..2.40 rows=129 width=0) (never executed)"
"                    Index Cond: ((field9)::text = 'val4'::text)"
"Planning time: 0.177 ms"
"Execution time: 0.061 ms"

Query with IN

SELECT field1, field2,..., fieldN
FROM table1
WHERE
((field9 IN ('val1', 'val2', 'val3', 'val4'))
AND (field6='val5'));

Query Plan:

"Index Scan using idx_field6_field9 on table1  (cost=0.43..6.77 rows=1 width=395) (actual time=0.032..0.032 rows=0 loops=1)"
"  Index Cond: (((field6)::text = 'val5'::text) AND ((field9)::text = ANY ('{val1,val2,val3,val4}'::text[])))"
"Planning time: 0.145 ms"
"Execution time: 0.055 ms"

There is an index on field 6 and field 9 which the second query uses as expected, which the first one also should. Field9 is also kind of like a state field, so its cardinality is extremely low - there's only like 9 different values across the whole table. Unfortunately, it isn't straightforward to change the query to use an IN clause, so getting PG to use the right plan would be ideal.

Upvotes: 1

Views: 227

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246103

There is no way you can get the fast plan (single index scan) using the OR condition. You'll have to rewrite the query.

You want to know why, which is always difficult to explain. With optimizations like that, there are usually two reasons:

  1. Nobody got around to do it.

  2. This requires extra effort every time a query with an OR is planned:

    • Are there several conditions linked with OR that have the same expression on one side?

    • Both plans, the original and the rewritten one, would have to be estimated. It may well be that the BitmapOr is the most efficient way to process the query.

    This price would have to be paid by every query with OR in it.

I am not saying that it is a bad idea to add an optimization like this, but there are two sides to the coin.

Upvotes: 1

Related Questions