How to avoid sequential scan in PostgreSQL query

Question

I have a 3-layer master-detail-subdetail table setup.

master: 100 rows
detail: 300,000 rows, with between 5 and 60,000 details per master row, average 3000 details
subdetail: 30,000,000 rows, about 100 subdetails per detail row, and up to 2M per master

When I do a query to get all the subdetails for a single master, PostgreSQL always wants to do a parallel sequential scan on subdetail, even when I know up front (due to denormalization of counts) that there's only 200k or so rows to fetch for my chosen master. I can control whether I want to execute the query or not at the application level based on application statistics of the subdetail count.

I want to force PostgreSQL to join master -> detail -> subdetail in that order, and not hash(master -> detail) -> parallel_scan(subdetail), which is what it persists in doing. I only issue the query when I know up front that it will be executable in reasonable time; scanning all subdetails takes too long, though.

To make that concrete, I might have SQL like this:

select subdetail.kind, count(*)
from subdetail
join detail on detail.id = subdetail.detail_id
join master on master.id = detail.master_id
where master.name = 'foo'
group by 1

I end up with a plan like this:

 Finalize GroupAggregate  (cost=841664.47..841665.23 rows=3 width=10) (actual time=34452.098..34452.100 rows=2 loops=1)
   Group Key: subdetail.kind
   Buffers: shared hit=1813 read=216528 dirtied=81 written=19
   ->  Gather Merge  (cost=841664.47..841665.17 rows=6 width=10) (actual time=34452.089..34454.982 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=5435 read=648798 dirtied=229 written=43
         ->  Sort  (cost=840664.45..840664.46 rows=3 width=10) (actual time=34447.488..34447.489 rows=2 loops=3)
               Sort Key: subdetail.kind
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 25kB
               Worker 1:  Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=5435 read=648798 dirtied=229 written=43
               ->  Partial HashAggregate  (cost=840664.39..840664.42 rows=3 width=10) (actual time=34447.456..34447.457 rows=2 loops=3)
                     Group Key: subdetail.kind
                     Buffers: shared hit=5421 read=648798 dirtied=229 written=43
                     ->  Hash Join  (cost=3603.43..840029.99 rows=126881 width=2) (actual time=5002.903..34426.634 rows=76274 loops=3)
                           Hash Cond: (subdetail.detail_id = detail.id)
                           Buffers: shared hit=5421 read=648798 dirtied=229 written=43
                           ->  Parallel Seq Scan on subdetail  (cost=0.00..785977.10 rows=13195610 width=6) (actual time=0.053..32007.809 rows=10796566 loops=3)
                                 Buffers: shared hit=5223 read=648798 dirtied=229 written=43
                           ->  Hash  (cost=3567.34..3567.34 rows=2887 width=4) (actual time=3.060..3.060 rows=2753 loops=3)
                                 Buckets: 4096  Batches: 1  Memory Usage: 129kB
                                 Buffers: shared hit=176
                                 ->  Nested Loop  (cost=102.80..3567.34 rows=2887 width=4) (actual time=0.321..1.953 rows=2753 loops=3)
                                       Buffers: shared hit=176
                                       ->  Seq Scan on master  (cost=0.00..2.30 rows=1 width=4) (actual time=0.035..0.038 rows=1 loops=3)
                                             Filter: ((name)::text = 'foo'::text)
                                             Rows Removed by Filter: 103
                                             Buffers: shared hit=3
                                       ->  Bitmap Heap Scan on detail  (cost=102.80..3536.17 rows=2887 width=8) (actual time=0.278..1.095 rows=2753 loops=3)
                                             Recheck Cond: (master_id = master.id)
                                             Heap Blocks: exact=35
                                             Buffers: shared hit=173
                                             ->  Bitmap Index Scan on index_detail_on_master_id_name  (cost=0.00..102.08 rows=2887 width=0) (actual time=0.255..0.255 rows=2753 loops=3)
                                                   Index Cond: (master_id = master.id)
                                                   Buffers: shared hit=68

In MySQL, I could adjust my query:

select subdetail.kind, count(*)
from master
straight_join detail on detail.master_id = master.id
straight_join subdetail on subdetail.detail_id = detail.id
where master.name = 'foo'
group by 1

But PostgreSQL does not have similar hinting.

The only way I've found to get this kind of query to run quickly is to fetch the detail ids separately, then embed the detail ids into the subdetail query in a giant IN. This approach gets me a query time of 131ms, vs 31 seconds with the sequential scan, for a master with about 200k subdetails. But surely there must be a better way, when the application has knowledge over the expected result set due to denormalization, to encourage PostgreSQL to execute a plan which might not be most efficient for all possible queries, but is known to be more efficient for the specific parameters chosen, than to break the query up at the application level.

I'm running PostgreSQL 11.6.

jjanes · Accepted Answer

Since you know ahead of time when you want to do this, just do a set enable_seqscan to off before you execute the query, and reset it after.

How to avoid sequential scan in PostgreSQL query

Answers (1)

Related Questions