Reputation: 1750
Environment
My PostgreSQL (9.2) schema looks like this:
CREATE TABLE first
(
id_first bigint NOT NULL,
first_date timestamp without time zone NOT NULL,
CONSTRAINT first_pkey PRIMARY KEY (id_first)
)
WITH (
OIDS=FALSE
);
CREATE INDEX first_first_date_idx
ON first
USING btree
(first_date);
CREATE TABLE second
(
id_second bigint NOT NULL,
id_first bigint NOT NULL,
CONSTRAINT second_pkey PRIMARY KEY (id_second),
CONSTRAINT fk_first FOREIGN KEY (id_first)
REFERENCES first (id_first) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
CREATE INDEX second_id_first_idx
ON second
USING btree
(id_first);
CREATE TABLE third
(
id_third bigint NOT NULL,
id_second bigint NOT NULL,
CONSTRAINT third_pkey PRIMARY KEY (id_third),
CONSTRAINT fk_second FOREIGN KEY (id_second)
REFERENCES second (id_second) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
CREATE INDEX third_id_second_idx
ON third
USING btree
(id_second);
So, I have 3 tables, each with its own PK. First has an index on first_date, Second has a FK to First with an index on it, and Third has a FK to Second with an index on it as well:
First (0 --> n) Second (0 --> n) Third
The First table contains about 10 000 000 records.
The Second table contains about 20 000 000 records.
The Third table contains about 18 000 000 records.
The date range in column first_date is from 2016-01-01 till today.
random_page_cost is set to 2.0.
default_statistics_target is set to 100.
The STATISTICS targets on all FK, PK and first_date columns are set to 5000.
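For reference, those targets were raised along these lines (a sketch; the column list is reconstructed from the schema above):
-- Raise the per-column sampling targets, then re-ANALYZE
-- so the planner picks up the finer statistics.
ALTER TABLE first ALTER COLUMN first_date SET STATISTICS 5000;
ALTER TABLE second ALTER COLUMN id_first SET STATISTICS 5000;
ALTER TABLE third ALTER COLUMN id_second SET STATISTICS 5000;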
Task to do
I want to count all Third rows connected with First where first_date < X.
My query:
SELECT count(t.id_third) AS count
FROM first f
JOIN second s ON s.id_first = f.id_first
JOIN third t ON t.id_second = s.id_second
WHERE first_date < _my_date
Problem description
_my_date = '2016-01-03'
Everything works pretty well; the query takes 1-2 seconds.
EXPLAIN ANALYZE:
"Aggregate (cost=8585512.55..8585512.56 rows=1 width=8) (actual time=67.310..67.310 rows=1 loops=1)"
" -> Merge Join (cost=4208477.49..8583088.04 rows=969805 width=8) (actual time=44.277..65.948 rows=17631 loops=1)"
" Merge Cond: (s.id_second = t.id_second)"
" -> Sort (cost=4208477.48..4211121.75 rows=1057709 width=8) (actual time=44.263..46.035 rows=19230 loops=1)"
" Sort Key: s.id_second"
" Sort Method: quicksort Memory: 1670kB"
" -> Nested Loop (cost=0.01..4092310.41 rows=1057709 width=8) (actual time=6.169..39.183 rows=19230 loops=1)"
" -> Index Scan using first_first_date_idx on first f (cost=0.01..483786.81 rows=492376 width=8) (actual time=6.159..12.223 rows=10346 loops=1)"
" Index Cond: (first_date < '2016-01-03 00:00:00'::timestamp without time zone)"
" -> Index Scan using second_id_first_idx on second s (cost=0.00..7.26 rows=7 width=16) (actual time=0.002..0.002 rows=2 loops=10346)"
" Index Cond: (id_first = f.id_first)"
" -> Index Scan using third_id_second_idx on third t (cost=0.00..4316649.89 rows=17193788 width=16) (actual time=0.008..7.293 rows=17632 loops=1)"
"Total runtime: 67.369 ms"
_my_date = '2016-01-11' or later
The query is not using an index scan anymore - it is replaced by a seq scan and takes 3-4 minutes...
Query plan:
"Aggregate (cost=8731468.75..8731468.76 rows=1 width=8) (actual time=234411.229..234411.229 rows=1 loops=1)"
" -> Hash Join (cost=4352424.81..8728697.88 rows=1108348 width=8) (actual time=189670.068..234400.540 rows=138246 loops=1)"
" Hash Cond: (t.id_second = o.id_second)"
" -> Seq Scan on third t (cost=0.00..4128080.88 rows=17193788 width=16) (actual time=0.016..124111.453 rows=17570724 loops=1)"
" -> Hash (cost=4332592.69..4332592.69 rows=1208810 width=8) (actual time=98566.740..98566.740 rows=151263 loops=1)"
" Buckets: 16384 Batches: 16 Memory Usage: 378kB"
" -> Hash Join (cost=561918.25..4332592.69 rows=1208810 width=8) (actual time=6535.801..98535.915 rows=151263 loops=1)"
" Hash Cond: (s.id_first = f.id_first)"
" -> Seq Scan on second s (cost=0.00..3432617.48 rows=18752248 width=16) (actual time=6090.771..88891.691 rows=19132869 loops=1)"
" -> Hash (cost=552685.31..552685.31 rows=562715 width=8) (actual time=444.630..444.630 rows=81650 loops=1)"
" -> Index Scan using first_first_date_idx on first f (cost=0.01..552685.31 rows=562715 width=8) (actual time=7.987..421.087 rows=81650 loops=1)"
" Index Cond: (first_date < '2016-01-13 00:00:00'::timestamp without time zone)"
"Total runtime: 234411.303 ms"
For test purposes, I have set:
SET enable_seqscan = OFF;
My queries start using index scans again and take 1-10 s (depending on the range).
Question
Why does it work like that? How can I convince the query planner to use an index scan?
EDIT
After reducing random_page_cost to 1.1, I can now select about 30 days while still getting an index scan. The query plan changed a little bit:
"Aggregate (cost=8071389.47..8071389.48 rows=1 width=8) (actual time=4915.196..4915.196 rows=1 loops=1)"
" -> Nested Loop (cost=0.01..8067832.28 rows=1422878 width=8) (actual time=14.402..4866.937 rows=399184 loops=1)"
" -> Nested Loop (cost=0.01..3492321.55 rows=1551849 width=8) (actual time=14.393..3012.617 rows=436794 loops=1)"
" -> Index Scan using first_first_date_idx on first f (cost=0.01..432541.99 rows=722404 width=8) (actual time=14.372..729.233 rows=236007 loops=1)"
" Index Cond: (first_date < '2016-02-01 00:00:00'::timestamp without time zone)"
" -> Index Scan using second_id_first_idx on second s (cost=0.00..4.17 rows=7 width=16) (actual time=0.008..0.009 rows=2 loops=236007)"
" Index Cond: (second = f.id_second)"
" -> Index Scan using third_id_second_idx on third t (cost=0.00..2.94 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=436794)"
" Index Cond: (id_second = s.id_second)"
"Total runtime: 4915.254 ms"
However, I still don't get why asking for a larger range causes a seq scan...
Interestingly, when I ask for a range just above some kind of limit, I get a query plan like this (here a select for 40 days - asking for more produces a full seq scan again):
"Aggregate (cost=8403399.27..8403399.28 rows=1 width=8) (actual time=138303.216..138303.217 rows=1 loops=1)"
" -> Hash Join (cost=3887619.07..8399467.63 rows=1572656 width=8) (actual time=44056.443..138261.203 rows=512062 loops=1)"
" Hash Cond: (t.id_second = s.id_second)"
" -> Seq Scan on third t (cost=0.00..4128080.88 rows=17193788 width=16) (actual time=0.004..119497.056 rows=17570724 loops=1)"
" -> Hash (cost=3859478.04..3859478.04 rows=1715203 width=8) (actual time=5695.077..5695.077 rows=560503 loops=1)"
" Buckets: 16384 Batches: 16 Memory Usage: 1390kB"
" -> Nested Loop (cost=0.01..3859478.04 rows=1715203 width=8) (actual time=65.250..5533.413 rows=560503 loops=1)"
" -> Index Scan using first_first_date_idx on first f (cost=0.01..477985.28 rows=798447 width=8) (actual time=64.927..1688.341 rows=302663 loops=1)"
" Index Cond: (first_date < '2016-02-11 00:00:00'::timestamp without time zone)"
" -> Index Scan using second_id_first_idx on second s (cost=0.00..4.17 rows=7 width=16) (actual time=0.010..0.012 rows=2 loops=302663)"
" Index Cond: (id_first = f.id_first)"
"Total runtime: 138303.306 ms"
UPDATE after Laurenz Albe's suggestions
After rewriting the query as Laurenz Albe suggested:
"Aggregate (cost=9102321.05..9102321.06 rows=1 width=8) (actual time=15237.830..15237.830 rows=1 loops=1)"
" -> Merge Join (cost=4578171.25..9097528.19 rows=1917143 width=8) (actual time=9111.694..15156.092 rows=803657 loops=1)"
" Merge Cond: (third.id_second = s.id_second)"
" -> Index Scan using third_id_second_idx on third (cost=0.00..4270478.19 rows=17193788 width=16) (actual time=23.650..5425.137 rows=803658 loops=1)"
" -> Materialize (cost=4577722.81..4588177.38 rows=2090914 width=8) (actual time=9088.030..9354.326 rows=879283 loops=1)"
" -> Sort (cost=4577722.81..4582950.09 rows=2090914 width=8) (actual time=9088.023..9238.426 rows=879283 loops=1)"
" Sort Key: s.id_second"
" Sort Method: external sort Disk: 15480kB"
" -> Merge Join (cost=673389.38..4341477.37 rows=2090914 width=8) (actual time=3662.239..8485.768 rows=879283 loops=1)"
" Merge Cond: (s.id_first = f.id_first)"
" -> Index Scan using second_id_first_idx on second s (cost=0.00..3587838.88 rows=18752248 width=16) (actual time=0.015..4204.308 rows=879284 loops=1)"
" -> Materialize (cost=672960.82..677827.55 rows=973345 width=8) (actual time=3662.216..3855.667 rows=892988 loops=1)"
" -> Sort (cost=672960.82..675394.19 rows=973345 width=8) (actual time=3662.213..3745.975 rows=476519 loops=1)"
" Sort Key: f.id_first"
" Sort Method: external sort Disk: 8400kB"
" -> Index Scan using first_first_date_idx on first f (cost=0.01..568352.90 rows=973345 width=8) (actual time=126.386..3233.134 rows=476519 loops=1)"
" Index Cond: (first_date < '2016-03-03 00:00:00'::timestamp without time zone)"
"Total runtime: 15244.404 ms"
Upvotes: 1
Views: 242
Reputation: 246063
First, it looks like some of the estimates are off.
Try to ANALYZE the tables and see if that changes the query plan chosen.
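For example:
ANALYZE first;
ANALYZE second;
ANALYZE third;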
What might also help is to lower random_page_cost to a value just over 1 and see if that improves the plan.
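You can try it in the current session first and set it in postgresql.conf once you are happy with it:
SET random_page_cost = 1.1;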
It is interesting to note that the index scan on third_id_second_idx in the fast query produces only 17632 rows instead of over 17 million, which I can only explain by assuming that from that row on, the values of id_second no longer match any row in the join of first and second, i.e. the merge join is completed after that.
You can try to exploit that with a rewritten query. Try
JOIN (SELECT id_second, id_third FROM third ORDER BY id_second) t
instead of
JOIN third t
That may result in a better plan since PostgreSQL won't optimize the ORDER BY away, and the planner may decide that since it has to sort third anyway, it may be cheaper to use a merge join. That way you trick the planner into choosing a plan that it wouldn't recognize as ideal. With a different value distribution the planner's original choice would probably be better.
Upvotes: 3