systemaddict

Reputation: 33

How to return a single group from a pandas DataFrame

I have a DataFrame of goal scorers and I would like to extract the top-scoring group into an array. This group can contain more than one item (in the example below there are two players with 8 goals).

So in the example below it would result in an array like this:

[{'goals': 8, 'name': 'Sergio Agüero', 'team': 'Manchester City'}, {'goals': 8, 'name': 'Tammy Abraham', 'team': 'Chelsea'}]

import pandas as pd 

data = [
    {
        "name": "Sergio Ag\u00fcero",
        "team": "Manchester City",
        "goals": "8"
    },
    {
        "name": "Tammy Abraham",
        "team": "Chelsea",
        "goals": "8"
    },
    {
        "name": "Pierre-Emerick Aubameyang",
        "team": "Arsenal",
        "goals": "7"
    },
    {
        "name": "Raheem Sterling",
        "team": "Manchester City",
        "goals": "6"
    },
    {
        "name": "Teemu Pukki",
        "team": "Norwich",
        "goals": "6"
    }
]

top_scorers = pd.DataFrame(data, columns=["name", "team", "goals"])

top_scoring_group = top_scorers.groupby("goals")

Upvotes: 0

Views: 84

Answers (2)

Quang Hoang

Reputation: 150765

If I understand correctly, you want every row whose goals value equals the column maximum:

(top_scorers[top_scorers['goals'].eq(top_scorers['goals'].max())]
     .to_dict('records')
)

Output:

[{'name': 'Sergio Agüero', 'team': 'Manchester City', 'goals': '8'},
 {'name': 'Tammy Abraham', 'team': 'Chelsea', 'goals': '8'}]
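
One caveat: goals is stored as strings in the question's data, so max() compares text and would rank "9" above "10". Below is a minimal sketch of the same filter on numeric values, assuming the goals should be treated as integers (as in the expected output in the question). The helper name top_scoring_rows is hypothetical, not something from the question.

```python
import pandas as pd

data = [
    {"name": "Sergio Ag\u00fcero", "team": "Manchester City", "goals": "8"},
    {"name": "Tammy Abraham", "team": "Chelsea", "goals": "8"},
    {"name": "Pierre-Emerick Aubameyang", "team": "Arsenal", "goals": "7"},
    {"name": "Raheem Sterling", "team": "Manchester City", "goals": "6"},
    {"name": "Teemu Pukki", "team": "Norwich", "goals": "6"},
]
top_scorers = pd.DataFrame(data, columns=["name", "team", "goals"])


def top_scoring_rows(df):
    """Hypothetical helper: rows tied for the highest goal count, as dicts."""
    # Assumption: goals holds numeric strings; convert so max() is numeric.
    numeric = df.assign(goals=pd.to_numeric(df["goals"]))
    # Boolean mask that is True for every row equal to the maximum.
    mask = numeric["goals"].eq(numeric["goals"].max())
    return numeric[mask].to_dict("records")


result = top_scoring_rows(top_scorers)
print(result)
```

Because the conversion happens on a copy made by assign, the original top_scorers frame keeps its string column.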

Upvotes: 2

89f3a1c

Reputation: 1488

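top_scoring_group = top_scorers.astype({'goals': int}).groupby("team", as_index=False)['goals'].sum().nlargest(1, 'goals', keep='all')['team']

This returns the team (not the player) with the most goals in total, and keep='all' retains every team tied for first place. The goals column holds strings in the question's data, so it is converted to int first; otherwise sum() would concatenate the strings instead of adding them.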
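
If the goal is the top-scoring group of players, as the question title suggests, the groupby from the question can be kept and the group with the largest key pulled out with get_group. This is only a sketch under the assumption that goals should be compared as integers; the variable names groups and best are illustrative.

```python
import pandas as pd

data = [
    {"name": "Sergio Ag\u00fcero", "team": "Manchester City", "goals": "8"},
    {"name": "Tammy Abraham", "team": "Chelsea", "goals": "8"},
    {"name": "Pierre-Emerick Aubameyang", "team": "Arsenal", "goals": "7"},
    {"name": "Raheem Sterling", "team": "Manchester City", "goals": "6"},
    {"name": "Teemu Pukki", "team": "Norwich", "goals": "6"},
]
top_scorers = pd.DataFrame(data, columns=["name", "team", "goals"])

# Assumption: goals holds numeric strings; cast so group keys compare numerically.
top_scorers["goals"] = top_scorers["goals"].astype(int)

# One group per distinct goal count; .groups maps each key to its row labels.
groups = top_scorers.groupby("goals")
best = max(groups.groups)

# Pull out only the group with the highest key and convert it to a list of dicts.
top_scoring_group = groups.get_group(best).to_dict("records")
print(top_scoring_group)
```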

Upvotes: 0
