Reputation: 4398
I have a dataframe, where I would like to separate column values into columns and sum the categories, grouping by 2 columns.
Data
tar date ex ppr stat
aa q122 bb 10 hi
aa q122 bb 15 hello
aa q122 bb 10 hi
cc q222 zz 20 hey
cc q222 zz 15 hello
Desired
tar date count ppr_sum hi hello hey
aa q122 3 35 2 1 0
cc q222 2 35 0 1 1
Doing
df1 = pd.crosstab(index = [df['tar'], df['date']], columns = df['stat'])
ppr_sum = df1.groupby(['tar','date'])['ppr'].sum()
Still not obtaining the sum of the stat column. Any suggestion is appreciated
Upvotes: 1
Views: 773
Reputation: 71570
Try with get_dummies
and groupby
:
>>> pd.get_dummies(df, prefix_sep='', prefix='', columns=['stat']).groupby(['tar', 'date'], as_index=False).sum().assign(count=df.groupby(['tar', 'date'], as_index=False)['ex'].size()['size'])
tar date ppr hello hey hi count
0 aa q122 35 1 0 2 3
1 cc q222 35 1 1 0 2
>>>
Upvotes: 2