Reputation: 36
when I run a query with a lateral join and a LIMIT
inside, it uses nested a loop join. But when I remove the LIMIT
it uses a Hash Right Join. Why?
EXPLAIN ANALYSE
SELECT proxy.*
FROM jobs
LEFT OUTER JOIN LATERAL (
SELECT proxy.*
FROM proxy
WHERE jobs.id = proxy.job_id
) proxy ON true
Hash Right Join (cost=2075.47..3029.05 rows=34688 width=12) (actual time=9.951..24.758 rows=35212 loops=1)
Hash Cond: (proxy.job_id = jobs.id)
-> Seq Scan on proxy (cost=0.00..524.15 rows=34015 width=12) (actual time=0.011..2.502 rows=34028 loops=1)
-> Hash (cost=1641.87..1641.87 rows=34688 width=4) (actual time=9.842..9.842 rows=34689 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 1732kB
-> Index Only Scan using jobs_pkey on jobs (cost=0.29..1641.87 rows=34688 width=4) (actual time=0.010..4.904 rows=34689 loops=1)
Heap Fetches: 921
But when I add limits to the query, the actual time jumps from 24 to 150:
EXPLAIN ANALYSE
SELECT proxy.*
FROM jobs
LEFT OUTER JOIN LATERAL (
SELECT proxy.*
FROM proxy
WHERE jobs.id = proxy.job_id
limit 1
) proxy ON true
Nested Loop Left Join (cost=0.58..290506.19 rows=34688 width=12) (actual time=0.024..155.753 rows=34689 loops=1)
-> Index Only Scan using jobs_pkey on jobs (cost=0.29..1641.87 rows=34688 width=4) (actual time=0.014..3.984 rows=34689 loops=1)
Heap Fetches: 921
-> Limit (cost=0.29..8.31 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=34689)
-> Index Scan using index_job_proxy_on_job_id on loc_job_source_materials (cost=0.29..8.31 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=34689)
Index Cond: (jobs.id = job_id)
Upvotes: 0
Views: 609
Reputation: 19350
Following on from @LaurenzAlbe's answer, I think we can help more if you show the complete query, so we know why you need a LATERAL join. For the (simplified) requirements you have mentioned so far, I think an equivalent is
SELECT DISTINCT ON(proxy.id) proxy.*
FROM proxy
RIGHT OUTER JOIN jobs
ON jobs.id = proxy.job_id;
Also, since you are only outputting columns from proxy
, you are effectively doing only an INNER JOIN, but with more computing effort.
Upvotes: 0
Reputation: 247625
The optimizer is smart enough to rewrite your first query to
SELECT proxy.*
FROM proxy
RIGHT OUTER JOIN jobs
ON jobs.id = proxy.job_id;
But this optimization cannot be made with the LIMIT
clause, so only a nested loop join is possible.
Upvotes: 2