Reputation: 377
I have one table containing stock per day (stock) and another containing all the locations (locations). The stock table only has locations that are filled, but I also need the empty shelves for calculating averages and such. That is why I want to join both tables, so that I also get the empty ones.
The locations table looks like:
Locations
A
B
C
The stock table looks like:
Date Location quantity
2021-01-01 A 5
2021-01-01 B 5
2021-01-01 A 5
2021-01-02 A 5
2021-01-02 A 5
What I would like:
Date Location quantity
2021-01-01 A 5
2021-01-01 B 5
2021-01-01 A 5
2021-01-01 C 0 <-- new because on 01-01, there was no C
2021-01-02 A 5
2021-01-02 A 5
2021-01-02 B 0 <-- new because on 01-02, there was no B
2021-01-02 C 0 <-- new because on 01-02, there was no C
There are more than a million rows in the locations table alone. What complicates it further is the duplicates (multiple products in the same location).
Upvotes: 1
Views: 1481
Reputation: 172993
Consider the approach below (fewer joins):
select date, location, sum(quantity) as quantity
from (
  select date, location, quantity
  from stock
  union all
  select date, location, 0 as quantity
  from (select distinct date from stock), locations
)
group by date, location
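Applied to the sample data in your question, this returns one row per date and location: the quantities are summed (10 for location A on each date) and empty locations get 0.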
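To try this without creating tables, here is a minimal sketch that inlines the question's sample data as CTEs. It assumes a dialect that supports `with` and typed `date` literals (BigQuery or PostgreSQL, for example). The aliases `d`, `l` and `filled` are only illustrative, and the comma join is written as an explicit cross join with aliases for portability.

```sql
-- Sample data from the question, inlined as CTEs.
with locations as (
  select 'A' as location union all
  select 'B' union all
  select 'C'
),
stock as (
  select date '2021-01-01' as date, 'A' as location, 5 as quantity union all
  select date '2021-01-01', 'B', 5 union all
  select date '2021-01-01', 'A', 5 union all
  select date '2021-01-02', 'A', 5 union all
  select date '2021-01-02', 'A', 5
)
select date, location, sum(quantity) as quantity
from (
  -- the real stock rows
  select date, location, quantity
  from stock
  union all
  -- one zero-quantity row for every (date, location) pair
  select d.date, l.location, 0 as quantity
  from (select distinct date from stock) as d
  cross join locations as l
) as filled
group by date, location
order by date, location;
```

Because of the group by, duplicate rows for the same date and location are collapsed into a single summed row, unlike the desired output in the question, which keeps them separate.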
Upvotes: 1
Reputation: 1269803
Use a cross join to generate the rows and a left join to bring in the data:
select d.date, l.location, coalesce(s.quantity, 0) as quantity
from (select distinct date from stock) d cross join
     locations l left join
     stock s
     on s.date = d.date and s.location = l.location;
You may have another source of dates, or can generate them using an array.
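As a rough sketch of the array option, assuming BigQuery Standard SQL (the question does not name the database), the dates can come from `generate_date_array` instead of from the stock table. The date range here is a hypothetical example.

```sql
-- Assumption: BigQuery Standard SQL; the date range is illustrative only.
select d as date, l.location, coalesce(s.quantity, 0) as quantity
from unnest(generate_date_array(date '2021-01-01', date '2021-01-02')) as d
cross join locations as l
left join stock as s
  on s.date = d and s.location = l.location;
```

Unlike the `select distinct date from stock` subquery, this also produces rows for dates on which the stock table has no entries at all.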
Note: This construct is often used to return exactly one row per location and date:
select d.date, l.location, coalesce(sum(s.quantity), 0) as quantity
from (select distinct date from stock) d cross join
     locations l left join
     stock s
     on s.date = d.date and s.location = l.location
group by 1, 2;
Upvotes: 1