Reputation: 5962
Why am I getting Seq scan
when I can see the partial index in \d+
commands;
\d+ call_records;
id | integer | not null default nextval('call_records_id_seq'::regclass) | plain | |
plain_crn | bigint |
active | boolean | default true
timestamp | bigint | default 0
Indexes:
"index_call_records_on_plain_crn" UNIQUE, btree (plain_crn)
"index_call_records_on_active" btree (active) WHERE active = true
As expected for id
was an Index Scan.
EXPLAIN select * from call_records where id=1;
QUERY PLAN
----------------------------------------------------------------------------------------
Index Scan using call_records_pkey on call_records (cost=0.14..8.16 rows=1 width=373)
Index Cond: (id = 1)
(2 rows)
Same goes for plain_crn
EXPLAIN select * from call_records where plain_crn=1;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Index Scan using index_call_records_on_plain_crn on call_records (cost=0.14..8.16 rows=1 width=373)
Index Cond: (plain_crn = 1)
(2 rows)
But, it not the same in case of active
.
EXPLAIN select * from call_records where active=true; QUERY PLAN
-----------------------------------------------------------------
Seq Scan on call_records (cost=0.00..12.00 rows=100 width=373)
Filter: active
(2 rows)
Upvotes: 2
Views: 47
Reputation: 95582
Whether PostgreSQL uses the index on "active" depends on the ratio of true to false. At some point where there are more true than false, the query planner will decide that table scan will probably be faster.
I built a table to test, and loaded a million rows of random(ish) data.
select active, count(*)
from call_records
group by active;
active count -- f 499983 t 500017
True and false have roughly the same number of rows. Here's the execution plan.
explain analyze
select * from call_records where active=true;
"Bitmap Heap Scan on call_records (cost=5484.82..15344.49 rows=500567 width=21) (actual time=56.542..172.084 rows=500017 loops=1)" " Filter: active" " Heap Blocks: exact=7354" " -> Bitmap Index Scan on call_records_active_idx (cost=0.00..5359.67 rows=250567 width=0) (actual time=55.040..55.040 rows=500023 loops=1)" " Index Cond: (active = true)" "Planning time: 0.105 ms" "Execution time: 204.209 ms"
Then I updated "active", updated the statistics, and checked again.
update call_records
set active = true
where id < 750000;
analyze call_records;
explain analyze
select * from call_records where active=true;
"Seq Scan on call_records (cost=0.00..22868.00 rows=874100 width=21) (actual time=0.032..280.506 rows=874780 loops=1)" " Filter: active" " Rows Removed by Filter: 125220" "Planning time: 0.316 ms" "Execution time: 337.400 ms"
Turning off sequential scans shows that, in my case, PostgreSQL made the right decision. The table scan (sequential scan) was about 10 ms faster.
set enable_seqscan = off;
explain analyze
select * from call_records where active=true;
"Index Scan using call_records_active_idx on call_records (cost=0.42..39071.14 rows=874100 width=21) (actual time=0.031..293.295 rows=874780 loops=1)" " Index Cond: (active = true)" "Planning time: 0.343 ms" "Execution time: 349.403 ms"
Upvotes: 3
Reputation: 16487
You should start with testing the cost of the index scan
SET enable_seqscan = OFF;
You'll see it's much higher than the seqscan. You probably have a very low active to total rows in your table. Since you're selecting *
Postgres still has to lookup each row and so it's much easier to do a sequential scan on all the rows rather than check the index and then have to fetch most of the pages.
Upvotes: 2