Reputation: 20737
I have a query that is slow for particular rows. For some of those rows Postgres chooses a Seq Scan
instead of an Index Scan, I assume because the sequential scan is actually going to be faster than using the index.
Here's the query plan using the indexes for a normal kind of workload (http://explain.depesz.com/s/1A2o):
EXPLAIN (ANALYZE, BUFFERS) SELECT "blocks".* FROM "blocks" INNER JOIN "jobs" ON "blocks"."job_id" = "jobs"."id" WHERE "jobs"."project_id" = 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.71..166.27 rows=19 width=130) (actual time=0.092..4.247 rows=2421 loops=1)
   Buffers: shared hit=350
   ->  Index Scan using index_jobs_on_project_id on jobs  (cost=0.29..18.81 rows=4 width=4) (actual time=0.044..0.099 rows=15 loops=1)
         Index Cond: (project_id = 1)
         Buffers: shared hit=17
   ->  Index Scan using index_blocks_on_job_id on blocks  (cost=0.42..36.67 rows=19 width=130) (actual time=0.021..0.133 rows=161 loops=15)
         Index Cond: (job_id = jobs.id)
         Buffers: shared hit=333
 Total runtime: 4.737 ms
(9 rows)
Here's the query plan choosing to do a sequential scan for a less normal kind of workload (http://explain.depesz.com/s/cJOd):
EXPLAIN (ANALYZE, BUFFERS) SELECT "blocks".* FROM "blocks" INNER JOIN "jobs" ON "blocks"."job_id" = "jobs"."id" WHERE "jobs"."project_id" = 2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1138.64..11236.94 rows=10421 width=130) (actual time=5.212..72.604 rows=2516 loops=1)
   Hash Cond: (blocks.job_id = jobs.id)
   Buffers: shared hit=5671
   ->  Seq Scan on blocks  (cost=0.00..8478.06 rows=303206 width=130) (actual time=0.008..24.573 rows=298084 loops=1)
         Buffers: shared hit=5446
   ->  Hash  (cost=1111.79..1111.79 rows=2148 width=4) (actual time=3.346..3.346 rows=2164 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 77kB
         Buffers: shared hit=225
         ->  Bitmap Heap Scan on jobs  (cost=40.94..1111.79 rows=2148 width=4) (actual time=0.595..2.158 rows=2164 loops=1)
               Recheck Cond: (project_id = 2)
               Buffers: shared hit=225
               ->  Bitmap Index Scan on index_jobs_on_project_id  (cost=0.00..40.40 rows=2148 width=0) (actual time=0.516..0.516 rows=2164 loops=1)
                     Index Cond: (project_id = 2)
                     Buffers: shared hit=8
 Total runtime: 72.767 ms
(15 rows)
In the first case, the project has 15 jobs and 2421 blocks. In the second case, the project has 2164 jobs and 2516 blocks.
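For reference, those counts come from simple count queries like these (shown for the second project):

SELECT count(*) FROM jobs WHERE project_id = 2;       -- 2164
SELECT count(*)
FROM blocks INNER JOIN jobs ON blocks.job_id = jobs.id
WHERE jobs.project_id = 2;                            -- 2516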
Is there a way to query this data so that the second workload isn't so slow? Or am I just hitting some kind of worst-case workload for the planner?
Edit
After updating random_page_cost to 1.1 and re-running EXPLAIN for the slow query (http://explain.depesz.com/s/xKdd):
EXPLAIN (ANALYZE, BUFFERS) SELECT "blocks".* FROM "blocks" INNER JOIN "jobs" ON "blocks"."job_id" = "jobs"."id" WHERE "jobs"."project_id" = 2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.71..7634.08 rows=10421 width=130) (actual time=0.025..10.597 rows=2516 loops=1)
   Buffers: shared hit=9206
   ->  Index Scan using index_jobs_on_project_id on jobs  (cost=0.29..1048.99 rows=2148 width=4) (actual time=0.015..1.239 rows=2164 loops=1)
         Index Cond: (project_id = 2)
         Buffers: shared hit=225
   ->  Index Scan using index_blocks_on_job_id on blocks  (cost=0.42..2.88 rows=19 width=130) (actual time=0.003..0.003 rows=1 loops=2164)
         Index Cond: (job_id = jobs.id)
         Buffers: shared hit=8981
 Total runtime: 10.925 ms
(9 rows)
Much better! Looks like I need to invest some time in tuning the server config.
Upvotes: 4
Views: 10044
Reputation: 324841
As a nested loop over two index scans is so much faster than the hash join over a seq scan and bitmap heap scan, I'd say your random_page_cost does not accurately reflect your real performance, at least when the data is cached in RAM or shared_buffers.
Try SET random_page_cost = 1.1 and re-run the query in that session. You might also want to throw more work_mem at the problem.
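For example, in a psql session (the work_mem value here is just an illustration; size it for your actual sorts and hashes):

SET random_page_cost = 1.1;
SET work_mem = '32MB';  -- illustrative value, not a recommendation
EXPLAIN (ANALYZE, BUFFERS)
SELECT "blocks".* FROM "blocks"
INNER JOIN "jobs" ON "blocks"."job_id" = "jobs"."id"
WHERE "jobs"."project_id" = 2;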
If the random_page_cost adjustment is effective, you'll probably want to update postgresql.conf to reflect it. Note that 1.1 is a pretty extreme setting; the default is 4 and seq_page_cost is 1, so in the config file I'd start with something more like 2 or 1.5 to avoid making other plans worse.
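A conservative starting point in postgresql.conf might look like this (the exact value is something to tune against your own workload, not a fixed recommendation):

# postgresql.conf
random_page_cost = 1.5    # default is 4; lower it when most reads come from cache

Then reload with SELECT pg_reload_conf(); (or pg_ctl reload); random_page_cost takes effect without a restart.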
Upvotes: 12