Danish
Danish

Reputation: 2871

Groupby count with less than a particular value of a numerical column

I have a data frame as shown below

ID    Class    Score1       Score2   Name
1     A        9            7        Xavi
2     B        7            8        Alba
3     A        10           8        Messi
4     A        8            10       Neymar
5     A        7            8        Mbappe
6     C        4            6        Silva
7     C        3            2        Pique
8     B        5            7        Ramos
9     B        6            7        Serge
10    C        8            5        Ayala

From the above, I would like to calculate the number of players with scoer1 less than or equal to 6 in each group

Expected Output:

Class     Total_Number     Score1_less_than_6_#    Avg_score1  
A         4                0                       8.5 
B         3                2                       6 
C         3                2                       5

Upvotes: 1

Views: 36

Answers (1)

jezrael
jezrael

Reputation: 862691

For improve performance first create helper column filled values by Series.between and then aggregate by named aggregation:

df = (df.assign(new = df['Score1'].between(1,6))
       .groupby('Class').agg(Total_Number = ('Score1','size'),
                             Score1_less_than_6 = ('Score1', 'sum'),
                             Avg_score1 = ('Score1','mean')))

Or use custom lambda function - slowier if large dataframe:

df = df.groupby('Class').agg(Total_Number = ('Score1','size'),
                             Score1_less_than_6 = ('Score1',lambda x: x.between(1,6).sum()),
                             Avg_score1 = ('Score1','mean'))
print (df)
       Total_Number  Score1_less_than_6  Avg_score1
Class                                              
A                 4                   0         8.5
B                 3                   2         6.0
C                 3                   2         5.0

Upvotes: 1

Related Questions