Reputation: 103
I have a dataframe similar to this:
id name status output
123 John a 33.33%
232 Wang b 50%
324 Wang a 50%
424 Cici a 100%
553 John b 33.33%
653 John b 33.33%
I need to 1) groupby name 2) count the percentage where status == a. Output is listed on the right
I used following code:
df['output'] = df.groupby('name')['id'].transform(lambda x: x[x['status == a']].count/len(x))
but the return was wrong, is there anyway I can fix this.
Upvotes: 1
Views: 54
Reputation: 323356
Try with crosstab
df['out'] = df.name.map(pd.crosstab(df.name,df.status,normalize='index')['a'])
df
Out[60]:
id name status output out
0 123 John a 33.33% 0.333333
1 232 Wang b 50% 0.500000
2 324 Wang a 50% 0.500000
3 424 Cici a 100% 1.000000
4 553 John b 33.33% 0.333333
5 653 John b 33.33% 0.333333
Upvotes: 0
Reputation: 150785
Also try this without a lambda function:
df['output'] = df['status'].eq('a').groupby(df['name']).transform('mean')
Upvotes: 2
Reputation: 967
use
df['output'] = df.groupby('name')['status'].transform(lambda x: (x=='a').mean())
You have some syntax mistakes in comparing and count
is a function not attribute
Upvotes: 0