Reputation: 5706
I have two huge tables which I want to join
live_inventory_calculation: 8.1M records
demand: 3M records
Query planner is using NestledLoop Join, which obviously is taking more than a minute. There is already an index on the joining columns. Disabling the nested loop with
SET LOCAL enable_nestloop = False;
Forces the planner to use Hash Join whic takes only 12 seconds.
Here is the query
SELECT
live_inventory_calculation.csku_id,
live_inventory_calculation.dc_code,
live_inventory_calculation.date,
count(*)
FROM live_inventory_calculation
INNER JOIN demand ON
live_inventory_calculation.csku_id = demand.sku_id
AND live_inventory_calculation.dc_code = demand.dc_code
AND live_inventory_calculation.date = demand.date
GROUP BY live_inventory_calculation.csku_id, live_inventory_calculation.dc_code,
live_inventory_calculation.production_week;
This is the case with every join with live_inventory_calculation
table. I have refreshed the stats so query planner has the correct statistics.
Here is the plan for both
Upvotes: 1
Views: 1356
Reputation: 10048
To answer your questions:
The query planner is choosing the nested loop
because the two sides of the join are one-large (3M rows), one-supersmall (0 rows). That would logically be the best way to join, because it takes extra time to sort-and-merge (Merge Join) or to hash-and-search (Hash Join).
Without knowing what data lives in the live_inventory_calculation
table, I suppose based on your objections that there should be a lot of matching rows, not zero. It seems that the memoize cache does not have the required data (Misses: 3009118
) so it needs to go to disk (or elsewhere) to retrieve the actual data.
When you set enable_nestedloop = off
, it decides that joining with the memoize
cache is too expensive, so it chooses the next best option, which is the Hash Join - this ends up being quick (but if the memoize cache was better, the original plan would probably faster).
It would be good to know which version of Postgres you're working with -- there have been many bug fixes related to memoize in the past several years.
Upvotes: 1