Reputation: 7644
I have a pandas dataframe
age gender criticality acknowledged
10 Male High Yes
10 Male High Yes
10 Male High Yes
10 Male Low Yes
11 Female Medium No
I want to groupby age and gender, and then put values of 'criticality', 'acknowledged' as new columns and get the count.
For eg output i desire is:
criticality acknowledged
age gender High Medium Low Yes No
10 Male 3 0 1 4 0
11 Female 0 1 0 0 1
I thought of using df.groupby(['age','gender'])['criticality','acknowledged'].stack()
But its not working.
Is there a better way to get results in this format
Upvotes: 2
Views: 539
Reputation: 75150
Another way using get_dummies()
with groupby()
after assigning
and finally split the columns with expand=True
for Multiindex:
l=['criticality','acknowledged']
final=df[['age','gender']].assign(**pd.get_dummies(df[l])).groupby(['age','gender']).sum()
final.columns=final.columns.str.split('_',expand=True)
print(final)
criticality acknowledged
High Low Medium No Yes
age gender
10 Male 3 1 0 0 4
11 Female 0 0 1 1 0
Upvotes: 1
Reputation: 6642
Since you are counting for the two columns separately, a concat would be an easy solution:
In [13]: pd.concat([df.pivot_table(index=['age', 'gender'], columns=col, aggfunc
...: =len) for col in ['criticality', 'acknowledged']], axis=1).fillna(0)
Out[13]:
acknowledged criticality
criticality High Low Medium No Yes
age gender
10 Male 3.0 1.0 0.0 0.0 4.0
11 Female 0.0 0.0 1.0 1.0 0.0
Upvotes: 2