Reputation: 433
My dataframe is looking something like this:
date | status | value
0 2020-01 | true | 3
1 2020-01 | true | 8
2 2020-02 | true | 5
3 2020-01 | false | 1
4 2020-01 | false | 11
For most functions I would call groupby
...
df_group = df.groupby(['date', 'status'])[['value']].agg('sum').reset_index()
...and obtain this:
date | status | value
0 2020-01 | true | 11
1 2020-02 | true | 5
2 2020-01 | false | 12
But what I'm trying to reach is a dataframe, where also the 'missing' values are listed, like this:
date | status | value
0 2020-01 | true | 11
1 2020-02 | true | 5
2 2020-01 | false | 12
3 2020-02 | false | 0
Is groupby
even the correct funtion?
Upvotes: 0
Views: 552
Reputation: 863166
Use Series.unstack
with DataFrame.stack
, also remove [[]]
and instead agg
is possible use only sum
:
df_group = (df.groupby(['date', 'status'])['value']
.sum()
.unstack(fill_value=0)
.stack()
.reset_index(name='value'))
print (df_group)
date status value
0 2020-01 False 12
1 2020-01 True 11
2 2020-02 False 0
3 2020-02 True 5
Upvotes: 2