Reputation: 8905
Given I have a table that has 2 decimal field x and y. I want to query records that has "x >= y". What type of index should I use to improve performance of such query?
Upvotes: 1
Views: 50
Reputation: 246493
An index will only be useful if the condition is selective, that is, if only a small percentage of the rows satisfy the condition. Otherwise a sequential scan is the cheapest way to find the matching rows.
I can think of two ways to accomplish this:
Rewrite the query and create a matching expression index:
CREATE INDEX ON mytable ((x - y));
Run ANALYZE
on the table to collect statistics for the index expression.
Then rewrite the query like this:
EXPLAIN SELECT * FROM mytable WHERE x - y >= 0;
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on mytable (cost=4.70..12.53 rows=55 width=20)
Recheck Cond: ((x - y) >= '0'::double precision)
-> Bitmap Index Scan on mytable_expr_idx (cost=0.00..4.69 rows=55 width=0)
Index Cond: ((x - y) >= '0'::double precision)
(4 rows)
Use a partial index:
CREATE INDEX ON mytable((bool 'TRUE')) WHERE x >= y;
Again ANALYZE
the table to get statistics.
This index can be used with the original query:
EXPLAIN SELECT * FROM mytable WHERE x >= y;
QUERY PLAN
------------------------------------------------------------------------------------
Index Scan using mytable_bool_idx on mytable (cost=0.14..17.74 rows=333 width=20)
(1 row)
If you index the selected columns instead of the (small) dummy value TRUE
and the table is not modified a lot, you can make that an index only scan which does not have to touch the table at all. The downside is that the index gets larger. You'd have to check what is best for you.
My test table contains 1000 rows, and 54 of them match the condition.
So the first method may be better, because the estimates are better. The partial index scan only uses a dumb estimate of one third of the table size.
Upvotes: 1