How to count number of rows per group greater than the average of that group in pandas group by?

Question

I have a data frame df and I use several columns from it to groupby:

df[['col1', 'col2', 'col3', 'col4']].groupby(['col1', 'col2']).agg(['mean', 'count'])

In the above way I get the table (data frame) that I need. What is missing is an additional column that contains number of rows in each group greater than mean(average) of that group. in fact I want something like:

col1  col2 mean   count  above_Mean
A      E    4      10     6    
       F    3      20     4 
B      G    7      30     15

I tried above return command which gives me everything except the above mean count of per group.

df[['col1', 'col2', 'col3', 'col4']].groupby(['col1', 'col2']).agg(['mean', 'count'])

output:

col1  col2 mean   count  above_Mean
A      E    4      10     6    
       F    3      20     4 
B      G    7      30     15

yatu · Accepted Answer

You could also aggregate using a lambda function as the following:

df.groupby(['col1', 'col2']).agg(['mean', 'count', 
                                  lambda x: (x > x.mean()).sum()])

How to count number of rows per group greater than the average of that group in pandas group by?

Answers (1)

Related Questions