Reputation: 485
I am trying to group rows by different conditions; here is an example. Basically, I want to group the Team names by country and put the result into a new DataFrame with the sum of Goals for each group. I have tried groupby but somehow cannot get what I want. How can I get the expected result? Thanks!
example = {'Team': ['Arsenal', 'Manchester United', 'Arsenal',
                    'Arsenal', 'Chelsea', 'Manchester United',
                    'Manchester United', 'Chelsea', 'Chelsea', 'Chelsea',
                    'Juventus', 'Juventus'],
           'Player': ['Ozil', 'Pogba', 'Lucas', 'Aubameyang',
                      'Hazard', 'Mata', 'Lukaku', 'Morata',
                      'Giroud', 'Kante',
                      'Ronaldo', 'Buffon'],
           'Goals': [5, 3, 6, 4, 9, 2, 0, 5, 2, 3, 20, 0]}
group_dict = {'UK':['Arsenal', 'Manchester United', 'Chelsea'], 'Italy':['Juventus']}
Expected Result:
Country Goals
UK 39
Italy 20
Upvotes: 1
Views: 330
Reputation: 18426
You can use np.select to assign a temporary Country column, then group by Country and call sum:
df = pd.DataFrame(example)
df.assign(Country=np.select([df['Team'].isin(v) for v in group_dict.values()],
                            list(group_dict.keys()),
                            '')).groupby('Country', sort=False, as_index=False)['Goals'].sum()
OUTPUT:
  Country  Goals
0      UK     39
1   Italy     20
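To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The variable names (conditions, choices, country, result) and the column name Country are my own choices for illustration, not part of the original answer; the data is copied from the question.

```python
import numpy as np
import pandas as pd

example = {'Team': ['Arsenal', 'Manchester United', 'Arsenal',
                    'Arsenal', 'Chelsea', 'Manchester United',
                    'Manchester United', 'Chelsea', 'Chelsea', 'Chelsea',
                    'Juventus', 'Juventus'],
           'Player': ['Ozil', 'Pogba', 'Lucas', 'Aubameyang',
                      'Hazard', 'Mata', 'Lukaku', 'Morata',
                      'Giroud', 'Kante',
                      'Ronaldo', 'Buffon'],
           'Goals': [5, 3, 6, 4, 9, 2, 0, 5, 2, 3, 20, 0]}
group_dict = {'UK': ['Arsenal', 'Manchester United', 'Chelsea'],
              'Italy': ['Juventus']}
df = pd.DataFrame(example)

# One boolean mask per country, in the same order as the keys of group_dict.
conditions = [df['Team'].isin(teams) for teams in group_dict.values()]
choices = list(group_dict.keys())

# For each row, np.select takes the choice whose condition is the first
# True one; rows that match no condition get the default ('' here).
country = np.select(conditions, choices, default='')

# Attach the labels as a temporary column and sum Goals per label.
# sort=False keeps the groups in order of first appearance.
result = (
    df.assign(Country=country)
      .groupby('Country', sort=False, as_index=False)['Goals']
      .sum()
)
print(result)
```

Note that a team missing from group_dict would end up in a group labelled with the empty string rather than being dropped, so it may be worth checking for that label if the mapping is incomplete.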
Upvotes: 2
Reputation: 215067
Create a dictionary that maps in reverse, from Team to Country, and then aggregate by Country:
df = pd.DataFrame(example)
df.Goals.groupby(
df.Team.map({v: k for k, lst in group_dict.items() for v in lst}).rename('Country')
).sum().reset_index()
# Country Goals
#0 Italy 20
#1 UK 39
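The same idea, broken into steps as a rough sketch. The names team_to_country, country and result are only illustrative, and sort=False is my addition so that the rows come out in the order shown in the question's expected result; drop it to get the alphabetical order above.

```python
import pandas as pd

example = {'Team': ['Arsenal', 'Manchester United', 'Arsenal',
                    'Arsenal', 'Chelsea', 'Manchester United',
                    'Manchester United', 'Chelsea', 'Chelsea', 'Chelsea',
                    'Juventus', 'Juventus'],
           'Player': ['Ozil', 'Pogba', 'Lucas', 'Aubameyang',
                      'Hazard', 'Mata', 'Lukaku', 'Morata',
                      'Giroud', 'Kante',
                      'Ronaldo', 'Buffon'],
           'Goals': [5, 3, 6, 4, 9, 2, 0, 5, 2, 3, 20, 0]}
group_dict = {'UK': ['Arsenal', 'Manchester United', 'Chelsea'],
              'Italy': ['Juventus']}
df = pd.DataFrame(example)

# Invert {country: [teams]} into {team: country}.
team_to_country = {team: country
                   for country, teams in group_dict.items()
                   for team in teams}

# Look up the country of every row; the Series keeps df's index,
# so it can be used directly as the grouping key.
country = df['Team'].map(team_to_country).rename('Country')

# Sum Goals per country, keeping groups in order of first appearance.
result = df['Goals'].groupby(country, sort=False).sum().reset_index()
print(result)
```

A team that is not listed in group_dict maps to NaN, and with groupby's default settings those rows are left out of the result.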
Upvotes: 4