Reputation: 5660
I have a Pandas dataframe where for certain dates
certain products
are missing. I want to add those rows to the dataframe and assign them a sales
value of 0. How can I do that?
# Sample dataframe
import pandas as pd
df = pd.DataFrame({
'date': ['2020-01-01', '2020-01-01', '2020-01-01', '2020-01-02', '2020-01-02', '2020-01-03', '2020-01-03'],
'product': ['glass', 'clothes', 'food', 'glass', 'food', 'glass', 'clothes'],
'sales': [100, 120, 50, 90, 60, 110, 130]
})
date product sales
0 2020-01-01 glass 100
1 2020-01-01 clothes 120
2 2020-01-01 food 50
3 2020-01-02 glass 90
4 2020-01-02 food 60
5 2020-01-03 glass 110
6 2020-01-03 clothes 130
## 'clothes' is missing for 2020-01-02 and 'food' is missing for 2020-01-03
## What I want to get:
date product sales
0 2020-01-01 glass 100
1 2020-01-01 clothes 120
2 2020-01-01 food 50
3 2020-01-02 glass 90
4 2020-01-02 clothes 0
5 2020-01-02 food 60
6 2020-01-03 glass 110
7 2020-01-03 clothes 130
8 2020-01-03 food 0
Upvotes: 0
Views: 42
Reputation: 153460
Use set_index
with reindex
:
(df.set_index(['date', 'product'])
.reindex(pd.MultiIndex.from_product([df['date'].unique(),
df['product'].unique()],
names=['date', 'product']),
fill_value=0)
.reset_index())
Output:
date product sales
0 2020-01-01 glass 100
1 2020-01-01 clothes 120
2 2020-01-01 food 50
3 2020-01-02 glass 90
4 2020-01-02 clothes 0
5 2020-01-02 food 60
6 2020-01-03 glass 110
7 2020-01-03 clothes 130
8 2020-01-03 food 0
Upvotes: 1
Reputation: 323226
Try with pivot
df=df.pivot(*df.columns).fillna(0).stack().to_frame('sales').reset_index()
df
Out[120]:
date product sales
0 2020-01-01 clothes 120.0
1 2020-01-01 food 50.0
2 2020-01-01 glass 100.0
3 2020-01-02 clothes 0.0
4 2020-01-02 food 60.0
5 2020-01-02 glass 90.0
6 2020-01-03 clothes 130.0
7 2020-01-03 food 0.0
8 2020-01-03 glass 110.0
Upvotes: 2
Reputation: 150735
You can do with unstack()/stack()
:
(df.set_index(['date','product'])
.unstack(fill_value=0)
.stack()
.reset_index()
)
Output:
date product sales
0 2020-01-01 clothes 120
1 2020-01-01 food 50
2 2020-01-01 glass 100
3 2020-01-02 clothes 0
4 2020-01-02 food 60
5 2020-01-02 glass 90
6 2020-01-03 clothes 130
7 2020-01-03 food 0
8 2020-01-03 glass 110
Upvotes: 2