Zach

Reputation: 75

PostgreSQL query takes 100x longer when I make the WHERE clauses more restrictive, counterintuitively

I have a PostgreSQL query that's taking more than 15 seconds. When I change some of the values in the WHERE clauses, the query goes over 100x faster, even though the changes broaden the scope of the query.

Query #1 takes 15.2 seconds:

    SELECT p.category_id, count(*)
    FROM cr JOIN p ON p_id = p.id
    WHERE p.category_id = 48
      AND cr.created_date > '2019-08-23 21:00:00 +0000'
    GROUP BY p.category_id;

Query #2 takes 44 milliseconds:

    SELECT p.category_id, count(*)
    FROM cr JOIN p ON p_id = p.id
    WHERE p.category_id = 48
      AND cr.created_date > '2017-08-23 21:00:00 +0000'
    GROUP BY p.category_id;

The only change is to created_date, where I've included two more years' worth of cr rows, yet the query is super fast.

Query #3 takes 292 milliseconds:

    SELECT p.category_id, count(*)
    FROM cr JOIN p ON p_id = p.id
    WHERE p.category_id < 49
      AND cr.created_date > '2019-08-23 21:00:00 +0000'
    GROUP BY p.category_id;

The only change from query #1 is to category_id, where I've included 47 more possible ids (1-48 instead of just 48), yet the query is still far faster than the original.

I'm at a loss. Queries #2 and #3 should be much more time-consuming, but it's only the first one that takes a long time.

Edit:

Here's the output when I do EXPLAIN (ANALYZE, BUFFERS) SELECT ... with query #1:

GroupAggregate  (cost=289.67..301.73 rows=1 width=12) (actual time=19075.641..19075.642 rows=1 loops=1)
  Group Key: p.category_id
  Buffers: shared hit=606998
  ->  Nested Loop  (cost=289.67..301.71 rows=1 width=4) (actual time=10.354..19069.554 rows=14609 loops=1)
        Buffers: shared hit=606998
        ->  Index Scan using _idx_category_id on p  (cost=0.42..8.44 rows=1 width=8) (actual time=0.018..3.676 rows=2642 loops=1)
              Index Cond: (category_id = 48)
              Buffers: shared hit=544
        ->  Bitmap Heap Scan on cr  (cost=289.25..293.26 rows=1 width=4) (actual time=7.203..7.211 rows=6 loops=2642)
              Recheck Cond: ((p_id = p.id) AND (created_date > '2019-08-23 21:00:00+00'::timestamp with time zone))
              Heap Blocks: exact=14591
              Buffers: shared hit=606454
              ->  BitmapAnd  (cost=289.25..289.25 rows=1 width=0) (actual time=7.199..7.199 rows=0 loops=2642)
                    Buffers: shared hit=591863
                    ->  Bitmap Index Scan on cr_p_id_index  (cost=0.00..9.88 rows=193 width=0) (actual time=0.006..0.006 rows=6 loops=2642)
                          Index Cond: (p_id = p.id)
                          Buffers: shared hit=7981
                    ->  Bitmap Index Scan on cr_created_date_index  (cost=0.00..276.71 rows=9637 width=0) (actual time=7.189..7.189 rows=79774 loops=2642)
                          Index Cond: (created_date > '2019-08-23 21:00:00+00'::timestamp with time zone)
                          Buffers: shared hit=583882
Planning time: 0.431 ms
Execution time: 19075.698 ms

And here it is for query #2:

GroupAggregate  (cost=0.85..720.03 rows=1 width=12) (actual time=31.628..31.630 rows=1 loops=1)
  Group Key: p.category_id
  Buffers: shared hit=24780
  ->  Nested Loop  (cost=0.85..719.92 rows=20 width=4) (actual time=0.024..27.561 rows=16281 loops=1)
        Buffers: shared hit=24780
        ->  Index Scan using _idx_category_id on p  (cost=0.42..8.44 rows=1 width=8) (actual time=0.015..1.883 rows=2642 loops=1)
              Index Cond: (category_id = 48)
              Buffers: shared hit=544
        ->  Index Scan using cr_p_id_index on cr  (cost=0.43..709.55 rows=193 width=4) (actual time=0.002..0.007 rows=6 loops=2642)
              Index Cond: (p_id = p.id)
              Filter: (created_date > '2017-08-23 21:00:00+00'::timestamp with time zone)
              Buffers: shared hit=24236
Planning time: 0.251 ms
Execution time: 31.663 ms

And here it is for query #3:

HashAggregate  (cost=52583.40..52583.67 rows=27 width=12) (actual time=389.141..389.150 rows=22 loops=1)
  Group Key: p.category_id
  Buffers: shared hit=89513, temp read=1187 written=1173
  ->  Hash Join  (cost=20169.93..52535.14 rows=9651 width=4) (actual time=235.938..363.334 rows=79992 loops=1)
        Hash Cond: (cr.p_id = p.id)
        Buffers: shared hit=89513, temp read=1187 written=1173
        ->  Index Scan using cr_created_date_index on cr  (cost=0.43..30914.94 rows=9651 width=4) (actual time=0.010..50.387 rows=79992 loops=1)
              Index Cond: (created_date > '2019-08-23 21:00:00+00'::timestamp with time zone)
              Buffers: shared hit=78529
        ->  Hash  (cost=14955.75..14955.75 rows=317740 width=8) (actual time=235.394..235.394 rows=321138 loops=1)
              Buckets: 131072  Batches: 8  Memory Usage: 2608kB
              Buffers: shared hit=10984, temp written=957
              ->  Seq Scan on p  (cost=0.00..14955.75 rows=317740 width=8) (actual time=0.010..127.499 rows=321138 loops=1)
                    Filter: (category_id < 49)
                    Rows Removed by Filter: 3
                    Buffers: shared hit=10984
Planning time: 0.241 ms
Execution time: 389.209 ms

Upvotes: 1

Views: 696

Answers (2)

jjanes

Reputation: 44202

->  Index Scan using _idx_category_id on p  (cost=0.42..8.44 rows=1 width=8) (actual time=0.015..1.883 rows=2642 loops=1)

When the estimated number of rows is off by a factor of 2642, bad plans will happen.

If you ANALYZE the table (ANALYZE P;), does that fix the problem? If so, then you have to wonder why the autovacuum system was not doing an autoanalyze on it.
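A minimal sketch of that check, using the question's table name p (the statistics view and columns are standard PostgreSQL):

    -- Refresh the planner statistics for p, then re-run EXPLAIN (ANALYZE, BUFFERS) on query #1.
    ANALYZE p;

    -- See when the table was last analyzed, manually or by autovacuum.
    SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
    FROM pg_stat_user_tables
    WHERE relname = 'p';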

Upvotes: 3

There are multiple reasons you might see this difference. Here are a few examples:

  1. If you run the same query more than once, the result may be served from cache. Make a slight change to the query (for example, add or remove a few seconds in the date value) and check again.
  2. The number of matching rows increases the time taken.

You can check performance with EXPLAIN in Postgres. It returns the query plan; compare the costs it reports for each query. This is the best way to compare the queries.

Ref: https://www.postgresql.org/docs/9.3/sql-explain.html
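For example, running the question's first query under plain EXPLAIN shows the estimated costs and row counts without executing it; adding ANALYZE (as the asker did) also runs the query and reports actual rows and timing:

    EXPLAIN
    SELECT p.category_id, count(*)
    FROM cr JOIN p ON p_id = p.id
    WHERE p.category_id = 48
      AND cr.created_date > '2019-08-23 21:00:00 +0000'
    GROUP BY p.category_id;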

You can quickly improve the query in the following ways.

  1. Add an appropriate index based on the column type (date, string) and the operator (=, LIKE, ~) you use. Postgres offers several index types, and you can choose one to suit the situation: for a date or timestamp column, a BRIN index can work well because such columns often follow the physical row order; for equality or LIKE matching, use a B-tree index. Without an index the whole table is scanned for each query; with an appropriate index, lookups are much faster (see the index sketch after this list).
  2. Use batch processing: fetch a specific number of rows (e.g. 1000) at a time instead of loading all records at once. Loading everything in one go can cause memory problems and takes much longer.

  3. Increase the PostgreSQL default configuration values as suggested on the following page, based on your machine's resources. This also helps performance.

https://pgtune.leopard.in.ua/#/
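A minimal sketch of the index DDL from item 1, using the question's cr table and created_date column; the index names here are made up for illustration:

    -- Default B-tree index: good for equality and range predicates.
    CREATE INDEX cr_created_date_btree_idx ON cr (created_date);

    -- BRIN index: very small, works well when created_date correlates with physical row order.
    CREATE INDEX cr_created_date_brin_idx ON cr USING brin (created_date);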

Upvotes: 0
