Average values with Pandas GroupBy

Question

I have a data frame similar to this:

name        |   points     |    tries    |        game
Alfred      |      3.4     |      2      |      sudoku
Alfred      |      3       |      1      |      sudoku
Alfred      |      2.1     |      4      |      tetris
Barry       |      2.7     |      3      |      tetris
Barry       |      1.1     |      2      |      sudoku
Cathy       |      4.1     |      2      |      tetris
Cathy       |      3.3     |      2      |      tetris

I am trying to first group by Games and inside every different "game", group by "name" and calculate total "points", total "tries", and the average points per try.

I have been reading about groupby but I don't find the day to do all this and at the same time calculate points/try for each group.

Any help would be appreciated.

This is for trying to analyze more easily some csv files. I have been able to open the data, read it and do some simple groupby commands but this multiple selection and average calculation is driving me crazy.

Code Different · Accepted Answer

If you are on pandas 0.25 or later and want to play with thew new NamedAgg:

result = df.groupby(['game', 'name']).agg(
    total_points = pd.NamedAgg('points', 'sum'),
    total_tries = pd.NamedAgg('tries', 'sum')
)
result['avg_point_per_try'] = result['total_points'] / result['total_tries']

If you are on pandas < 0.25, you just have to rename the columns manually:

result = df.groupby(['game', 'name']).sum()
result.columns = ['total_points', 'total_tries']
result['avg_point_per_try'] = result['total_points'] / result['total_tries']

Result:

               total_points  total_tries  avg_point_per_try
game   name                                                
sudoku Alfred           6.4            3           2.133333
       Barry            1.1            2           0.550000
tetris Alfred           2.1            4           0.525000
       Barry            2.7            3           0.900000
       Cathy            7.4            4           1.850000

Average values with Pandas GroupBy

Answers (1)

Related Questions