Reputation: 1097
Say I have a dataset like
z = pd.DataFrame({'match': [1,1,2,2,3,3,4,4],
'group': [0,1,1,2,0,2,1,2],
'win': [0,1,1,0,0,1,0,1]})
Notice, for example, that group "1" and "2" are paired up twice (in match 2 and 4), and in this matchup, group 2 has a 50% win rate.
I want an array that shows the avg win rate of each group pairing, something like
0 1 2
0 - 0.0 0.0
1 1.0 - 0.5
2 1.0 0.5 -
but I have no idea how to get there with standard groupby
and agg
methods.
Upvotes: 0
Views: 559
Reputation: 215137
Option one (Updated):
left = z.assign(
_id = lambda x: x.groupby('match').cumcount()
).pivot('match', '_id', 'group').join(
z.groupby('match').win.first()
)
# for each match create another row with players reversed and assign
# the opposite score
right = left.rename(columns={0: 1, 1:0}).assign(win=lambda x: 1-x.win)
left.append(right).groupby([0,1]).win.mean().unstack()
1 0 1 2
0
0 NaN 0.0 0.0
1 1.0 NaN 0.5
2 1.0 0.5 NaN
Option two (original answer):
apply
to calculate all pair wise win rates;p = z.pivot('match', 'group', 'win')
# calculate the win rate of x over y
def win_rate(x, y):
if x.name == y.name:
return None
return x[x.notnull() & y.notnull()].mean()
# calculate pair wise win rate
p.apply(lambda x: p.apply(lambda y: win_rate(y, x)))
group 0 1 2
group
0 NaN 0.0 0.0
1 1.0 NaN 0.5
2 1.0 0.5 NaN
Upvotes: 2