Reputation: 4610
I am computing a DataFrame with weekly amounts and now I need to fill it with missing weeks from a provided date range.
This is how I'm generating the dataframe with the weekly amounts:
df['date'] = pd.to_datetime(df['date']) - timedelta(days=6)
weekly_data: pd.DataFrame = (df
.groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type]
.sum()
.reset_index()
)
Which outputs:
date sum
0 2020-10-11 78
1 2020-10-18 673
If a date range is given as start='2020-08-30'
and end='2020-10-30'
, then I would expect the following dataframe:
date sum
0 2020-08-30 0.0
1 2020-09-06 0.0
2 2020-09-13 0.0
3 2020-09-20 0.0
4 2020-09-27 0.0
5 2020-10-04 0.0
6 2020-10-11 78
7 2020-10-18 673
8 2020-10-25 0.0
So far, I have managed to just add the missing weeks and set the sum to 0, but it also replaces the existing values:
weekly_data = weekly_data.reindex(pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN')).fillna(0)
Which outputs:
date sum
0 2020-08-30 0.0
1 2020-09-06 0.0
2 2020-09-13 0.0
3 2020-09-20 0.0
4 2020-09-27 0.0
5 2020-10-04 0.0
6 2020-10-11 0.0 # should be 78
7 2020-10-18 0.0 # should be 673
8 2020-10-25 0.0
Upvotes: 2
Views: 286
Reputation: 862581
Remove reset_index
for DatetimeIndex
, because reindex
working with index and if RangeIndex
get 0
values, because no match:
weekly_data = (df.groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type]
.sum()
)
Then is possible use fill_value=0
parameter and last add reset_index
:
r = pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN', name='date')
weekly_data = weekly_data.reindex(r, fill_value=0).reset_index()
print (weekly_data)
date sum
0 2020-08-30 0
1 2020-09-06 0
2 2020-09-13 0
3 2020-09-20 0
4 2020-09-27 0
5 2020-10-04 0
6 2020-10-11 78
7 2020-10-18 673
8 2020-10-25 0
Upvotes: 1