Reputation: 99
I realize this is a very simple question but infuriatingly I am unable to figure this out. I want to group by position and return the maximum grandtotal value for each distinct position. This is easy. However, I cannot find a way to also include the player column in addition in the resulting dataframe. How can I create a dataframe that includes the one player with the max score for each position?
Given dataframe result:
result = pd.DataFrame([
{'position': 'Top', 'grandtotal': 36.72, 'player': 'Joe'},
{'position': 'Top', 'grandtotal': 101.18, 'player': 'Bill'},
{'position': 'Middle', 'grandtotal': 33.32, 'player': 'Jim'},
{'position': 'Middle', 'grandtotal': 72.62, 'player': 'Him'},
{'position': 'Bottom', 'grandtotal': 42.61, 'player': 'Tim'},
{'position': 'Bottom', 'grandtotal': 83.98, 'player': 'Slim'},])
I can return the position and the grandtotal with:
playerframe = result.groupby(['player', 'position']).sum().reset_index()
Giving me:
Top 101.18
Middle 72.62
Bottom 83.98
I just need the top scoring player's name included as well! It seems like it should be so simple. I'd like:
Top Bill 101.18
Middle Him 72.62
Bottom Slim 83.98
Upvotes: 3
Views: 2337
Reputation: 323226
By using sort_values
and groupby
agg
result.sort_values('grandtotal').groupby(['position'],as_index=False).agg({'player':'last','grandtotal':'sum'})
Out[345]:
position grandtotal player
0 Bottom 126.59 Slim
1 Middle 105.94 Him
2 Top 137.90 Bill
Based on your description, if only need the max value row for each group:
result.sort_values('grandtotal').drop_duplicates(['position'],keep='last')
Out[347]:
grandtotal player position
3 72.62 Him Middle
5 83.98 Slim Bottom
1 101.18 Bill Top
Upvotes: 2
Reputation: 51165
loc
with groupby
and idxmax
df.loc[df.groupby('position')['grandtotal'].idxmax()]
grandtotal player position
5 83.98 Slim Bottom
3 72.62 Him Middle
1 101.18 Bill Top
Upvotes: 2