Reputation: 103

groupby and then count by status

I have a dataframe similar to this:

id    name  status output
123    John   a      33.33%
232    Wang   b      50%
324    Wang   a      50%
424    Cici   a      100%
553    John   b      33.33%
653    John   b      33.33%

I need to 1) groupby name 2) count the percentage where status == a. Output is listed on the right

I used following code:

df['output'] = df.groupby('name')['id'].transform(lambda x: x[x['status == a']].count/len(x))

but the return was wrong, is there anyway I can fix this.

Upvotes: 1

Answers (3)

BENY

Reputation: 323356

Try with crosstab

df['out'] = df.name.map(pd.crosstab(df.name,df.status,normalize='index')['a'])
df
Out[60]: 
    id  name status  output       out
0  123  John      a  33.33%  0.333333
1  232  Wang      b     50%  0.500000
2  324  Wang      a     50%  0.500000
3  424  Cici      a    100%  1.000000
4  553  John      b  33.33%  0.333333
5  653  John      b  33.33%  0.333333

Upvotes: 0

Quang Hoang

Reputation: 150785

Also try this without a lambda function:

df['output'] = df['status'].eq('a').groupby(df['name']).transform('mean')

Upvotes: 2

Raghava Dhanya

Reputation: 967

use

df['output'] = df.groupby('name')['status'].transform(lambda x: (x=='a').mean())

You have some syntax mistakes in comparing and count is a function not attribute

Upvotes: 0

groupby and then count by status

Answers (3)

Related Questions