Reputation: 65
I have a data frame similar to this:
name | points | tries | game
Alfred | 3.4 | 2 | sudoku
Alfred | 3 | 1 | sudoku
Alfred | 2.1 | 4 | tetris
Barry | 2.7 | 3 | tetris
Barry | 1.1 | 2 | sudoku
Cathy | 4.1 | 2 | tetris
Cathy | 3.3 | 2 | tetris
I am trying to first group by Games and inside every different "game", group by "name" and calculate total "points", total "tries", and the average points per try.
I have been reading about groupby but I don't find the day to do all this and at the same time calculate points/try for each group.
Any help would be appreciated.
This is for trying to analyze more easily some csv files. I have been able to open the data, read it and do some simple groupby commands but this multiple selection and average calculation is driving me crazy.
Upvotes: 1
Views: 163
Reputation: 93161
If you are on pandas 0.25 or later and want to play with thew new NamedAgg
:
result = df.groupby(['game', 'name']).agg(
total_points = pd.NamedAgg('points', 'sum'),
total_tries = pd.NamedAgg('tries', 'sum')
)
result['avg_point_per_try'] = result['total_points'] / result['total_tries']
If you are on pandas < 0.25, you just have to rename the columns manually:
result = df.groupby(['game', 'name']).sum()
result.columns = ['total_points', 'total_tries']
result['avg_point_per_try'] = result['total_points'] / result['total_tries']
Result:
total_points total_tries avg_point_per_try
game name
sudoku Alfred 6.4 3 2.133333
Barry 1.1 2 0.550000
tetris Alfred 2.1 4 0.525000
Barry 2.7 3 0.900000
Cathy 7.4 4 1.850000
Upvotes: 2