user19446980
user19446980

Reputation: 43

Pandas - head to head score counting

I have a tennis data frame and I am trying to get some statistics from the data. I would like to be able to see the head to head score of the players whether they are player 1 or player 2 and also the same score but on each surface. It will have to have a shift(1) as I want to check the score before the game starts

This is currently what I have

p1_name p2_name Surface p1_win p2_win
Murray Nadal Hard 1 0
Nadal Murray Clay 1 0
Nadal Murray Hard 0 1
Murray Federer Clay 1 0

and this is what I would want with possibly a p1/2 h2h Surface stat

p1_name p2_name Surface p1_win p2_win p1_h2h p2_h2h
Murray Nadal Hard 1 0 na na
Nadal Murray Clay 1 0 0% 100%
Nadal Murray Hard 0 1 50% 50%
Murray Federer Clay 1 0 na na

Any help is much appreciated.

Upvotes: 3

Views: 100

Answers (1)

Tim Roberts
Tim Roberts

Reputation: 54698

I think this can be a good a starting place for you.

import pandas as pd

data = [
    [ 'Murray', 'Nadal', 'Hard', 1, 0 ],
    [ 'Nadal', 'Murray', 'Clay', 1, 0 ],
    [ 'Nadal', 'Murray', 'Hard', 0, 1 ],
    [ 'Murray', 'Federer', 'Clay', 1, 0 ]
]

df = pd.DataFrame( data, columns=['p1_name', 'p2_name', 'Surface', 'p1_win', 'p2_win'] )
print(df)
df1 = df.rename(columns={'p1_name':'p2_name','p2_name':'p1_name','p1_win':'p2_win','p2_win':'p1_win'})
print(df1)
df2 = pd.concat( [df, df1] )
print(df2)
df3 = df2.groupby(['p1_name','p2_name','Surface']).sum()
print(df3)
df3['p1_pct'] = df3['p1_win'] / (df3['p1_win']+df3['p2_win'])
df3['p2_pct'] = df3['p2_win'] / (df3['p1_win']+df3['p2_win'])
print(df3)

Output:

  p1_name  p2_name Surface  p1_win  p2_win
0  Murray    Nadal    Hard       1       0
1   Nadal   Murray    Clay       1       0
2   Nadal   Murray    Hard       0       1
3  Murray  Federer    Clay       1       0
  p2_name  p1_name Surface  p2_win  p1_win
0  Murray    Nadal    Hard       1       0
1   Nadal   Murray    Clay       1       0
2   Nadal   Murray    Hard       0       1
3  Murray  Federer    Clay       1       0
   p1_name  p2_name Surface  p1_win  p2_win
0   Murray    Nadal    Hard       1       0
1    Nadal   Murray    Clay       1       0
2    Nadal   Murray    Hard       0       1
3   Murray  Federer    Clay       1       0
0    Nadal   Murray    Hard       0       1
1   Murray    Nadal    Clay       0       1
2   Murray    Nadal    Hard       1       0
3  Federer   Murray    Clay       0       1
                         p1_win  p2_win
p1_name p2_name Surface                
Federer Murray  Clay          0       1
Murray  Federer Clay          1       0
        Nadal   Clay          0       1
                Hard          2       0
Nadal   Murray  Clay          1       0
                Hard          0       2
                         p1_win  p2_win  p1_pct  p2_pct
p1_name p2_name Surface
Federer Murray  Clay          0       1     0.0     1.0
Murray  Federer Clay          1       0     1.0     0.0
        Nadal   Clay          0       1     0.0     1.0
                Hard          2       0     1.0     0.0
Nadal   Murray  Clay          1       0     1.0     0.0
                Hard          0       2     0.0     1.0

Upvotes: 2

Related Questions