Reputation: 693

Dataframe group and sort

I have a dataframe that roughly looks like this:

I could group and sort them with df_players.groupby(['vehicle']).agg({'frags':sum}).sort_values(['frags'],ascending=False)

How can I group the vehicles and get their win rates just like this:

EDIT: (data)

[
    {"player": "bob", "vehicle": "a", "hasWon": True, "frags": 5},
    {"player": "foo", "vehicle": "b", "hasWon": False, "frags": 3},
    {"player": "bar", "vehicle": "c", "hasWon": True, "frags": 2}
]

Upvotes: 2

Answers (3)

CypherX

Reputation: 7353

Solution

# Assuming 
#     winrate = sum(hasWon*frags)/sum(frags)
# df['winrate'] = df['hasWon']*df['frags']/df['frags'].sum()

# Calculate Aggregate Result
result = (df.groupby(['vehicle'])
            .agg({
                'frags': sum, 
                'winrate': sum
            })

Dummy Data

import numpy as np
import pandas as pd

df = pd.DataFrame({'player': ['bob', 'foo', 'bar', 'foo2', 'bar2'], 
                   'vehicle': list('abcab'), 
                   'hasWon': [True, False, True, True, True], 
                   'frags': [5, 3, 2, 4, 2]})
# Assuming 
#     winrate = sum(hasWon*frags)/sum(frags)
df['winrate'] = df['hasWon']*df['frags']/df['frags'].sum()

df

Upvotes: 1

Serge Ballesta

Reputation: 148880

I would just assign a new column to just use a mean:

df.assign(winrate=np.where(df['hasWon'], 100, 0)).groupby('vehicle').agg({'winrate': 'mean'})

Upvotes: 1

jezrael

Reputation: 862521

I think you need aggregate mean of hasWon column, because Trues are processing like 1, then multiple by 100, round and last convert to column:

df_players = pd.DataFrame({'vehicle': list('aabbccc'), 
                           'hasWon':[True, False, True, True, False, True, False], 
                           'frags':[5,2,3,6,5,4,5]})
print (df_players)
  vehicle  hasWon  frags
0       a    True      5
1       a   False      2
2       b    True      3
3       b    True      6
4       c   False      5
5       c    True      4
6       c   False      5

df = (df_players.groupby(['vehicle'])['hasWon']
                 .mean()
                 .mul(100)
                 .round(2)
                 .reset_index(name='winrate'))
print (df)
  vehicle  winrate
0       a    50.00
1       b   100.00
2       c    33.33

Upvotes: 1

Dataframe group and sort

Answers (3)

Solution

Dummy Data

Related Questions