Reputation: 508
I'm having trouble applying a transform to a two-column groupby in pandas. I've tried a number of things based on similar use cases.
I want to group by date and user and apply a transform to a flag column: if the value is 'nan' then 0, else 1. My data looks like this:
    user        date Flag
0    ron  12/21/2019    1
1    ron  12/22/2019    2
2  april  12/21/2016  nan
3  april  12/23/2016    1
4   andy  12/21/2016  nan
Here's what I've set up, which makes sense to me logically, but I get a KeyError:
s = master['Flag'].eq('nan').groupby(master['date','user']).transform('any')
master.loc[:,'attendance'] = s.map({True:0,False: 1})
KeyError: ('date', 'user')
Upvotes: 0
Views: 966
Reputation: 1126
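master['Flag'].eq('nan') returns a plain Series, so the .groupby call that follows has no 'date' or 'user' columns of its own to group by; the grouping keys have to be passed in explicitly.
The KeyError itself comes from master['date','user'], which pandas reads as a lookup of a single column labelled ('date', 'user').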
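As a minimal sketch of the point above, the snippet below reproduces the KeyError and then groups the boolean Series by the two key columns passed as a list of Series. It assumes the Flag column holds strings (so a missing value is the literal 'nan', as in the question's comparison); the DataFrame is rebuilt here from the sample rows only for illustration.

```python
import pandas as pd

# Sample rows from the question; Flag is assumed to be stored as strings.
master = pd.DataFrame({
    "user": ["ron", "ron", "april", "april", "andy"],
    "date": ["12/21/2019", "12/22/2019", "12/21/2016", "12/23/2016", "12/21/2016"],
    "Flag": ["1", "2", "nan", "1", "nan"],
})

# master['date', 'user'] is a lookup of ONE column labelled ('date', 'user').
key_error = None
try:
    master["date", "user"]
except KeyError as exc:
    key_error = exc
    print("KeyError:", exc)

# A Series has no columns of its own, so pass the key Series explicitly.
is_nan = master["Flag"].eq("nan")
s = is_nan.groupby([master["date"], master["user"]]).transform("any")

# transform keeps the original row index, so the result assigns straight back.
master["attendance"] = s.map({True: 0, False: 1})
print(master)
```

If the column actually contains real missing values rather than the string 'nan', master['Flag'].isna() would be the comparison to use instead of .eq('nan').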
If I have understood the whole task correctly, here is the code:
# step 1: turn Flag into a boolean that is True where the value is 'nan'
master['Flag'] = master['Flag'] == 'nan'
master
Out[1]:
    user        date   Flag
0    ron  12/21/2019  False
1    ron  12/22/2019  False
2  april  12/21/2016   True
3  april  12/23/2016  False
4   andy  12/21/2016   True
# step 2: group by both columns; 'any' is True if any row in the group was 'nan'
s = master.groupby(['date','user']).agg('any')
s
Out[2]:
                   Flag
date       user
12/21/2016 andy    True
           april   True
12/21/2019 ron    False
12/22/2019 ron    False
12/23/2016 april  False
# step 3: map the boolean to the attendance value (True -> 0, False -> 1)
s['attendance'] = s['Flag'].map({True:0,False: 1})
s
Out[3]:
                   Flag  attendance
date       user
12/21/2016 andy    True           0
           april   True           0
12/21/2019 ron    False           1
12/22/2019 ron    False           1
12/23/2016 april  False           1
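# the same thing as a single expression, run on the original master (before step 1 overwrites Flag)
master.assign(flg = master['Flag'] == 'nan').groupby(['date','user'])[['flg']].agg('any')['flg'].map({True:0,False: 1}).to_frame()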
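The agg-based code above returns one row per (date, user) group. Since the question asks for a transform, here is a hedged sketch of a row-aligned variant that writes attendance back onto every original row. It assumes Flag holds real missing values (NaN) rather than the string 'nan', and the sixth row is a hypothetical duplicate (date, user) pair added only to show how transform differs from agg.

```python
import numpy as np
import pandas as pd

master = pd.DataFrame({
    "user": ["ron", "ron", "april", "april", "andy", "april"],
    "date": ["12/21/2019", "12/22/2019", "12/21/2016",
             "12/23/2016", "12/21/2016", "12/21/2016"],
    # Assumption: missing flags are real NaN; the last row is hypothetical.
    "Flag": [1, 2, np.nan, 1, np.nan, 3],
})

# Helper column: True where Flag is missing.
result = master.assign(flg=master["Flag"].isna())

# transform('any') broadcasts each group's result back to all rows of the group.
group_has_nan = result.groupby(["date", "user"])["flg"].transform("any")
result["attendance"] = group_has_nan.map({True: 0, False: 1})

# For comparison, agg('any') collapses to one row per (date, user) group.
per_group = result.groupby(["date", "user"])["flg"].agg("any")

print(result)
print(per_group)
```

With transform, both april rows for 12/21/2016 get attendance 0 because one of them has a missing flag, while the frame keeps its original six rows; the agg result has five rows, one per group.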
Upvotes: 1