Reputation: 43
I have a tennis data frame and I am trying to get some statistics from the data. I would like to be able to see the head to head score of the players whether they are player 1 or player 2 and also the same score but on each surface. It will have to have a shift(1) as I want to check the score before the game starts
This is currently what I have
p1_name | p2_name | Surface | p1_win | p2_win |
---|---|---|---|---|
Murray | Nadal | Hard | 1 | 0 |
Nadal | Murray | Clay | 1 | 0 |
Nadal | Murray | Hard | 0 | 1 |
Murray | Federer | Clay | 1 | 0 |
and this is what I would want with possibly a p1/2 h2h Surface stat
p1_name | p2_name | Surface | p1_win | p2_win | p1_h2h | p2_h2h |
---|---|---|---|---|---|---|
Murray | Nadal | Hard | 1 | 0 | na | na |
Nadal | Murray | Clay | 1 | 0 | 0% | 100% |
Nadal | Murray | Hard | 0 | 1 | 50% | 50% |
Murray | Federer | Clay | 1 | 0 | na | na |
Any help is much appreciated.
Upvotes: 3
Views: 100
Reputation: 54698
I think this can be a good a starting place for you.
import pandas as pd
data = [
[ 'Murray', 'Nadal', 'Hard', 1, 0 ],
[ 'Nadal', 'Murray', 'Clay', 1, 0 ],
[ 'Nadal', 'Murray', 'Hard', 0, 1 ],
[ 'Murray', 'Federer', 'Clay', 1, 0 ]
]
df = pd.DataFrame( data, columns=['p1_name', 'p2_name', 'Surface', 'p1_win', 'p2_win'] )
print(df)
df1 = df.rename(columns={'p1_name':'p2_name','p2_name':'p1_name','p1_win':'p2_win','p2_win':'p1_win'})
print(df1)
df2 = pd.concat( [df, df1] )
print(df2)
df3 = df2.groupby(['p1_name','p2_name','Surface']).sum()
print(df3)
df3['p1_pct'] = df3['p1_win'] / (df3['p1_win']+df3['p2_win'])
df3['p2_pct'] = df3['p2_win'] / (df3['p1_win']+df3['p2_win'])
print(df3)
Output:
p1_name p2_name Surface p1_win p2_win
0 Murray Nadal Hard 1 0
1 Nadal Murray Clay 1 0
2 Nadal Murray Hard 0 1
3 Murray Federer Clay 1 0
p2_name p1_name Surface p2_win p1_win
0 Murray Nadal Hard 1 0
1 Nadal Murray Clay 1 0
2 Nadal Murray Hard 0 1
3 Murray Federer Clay 1 0
p1_name p2_name Surface p1_win p2_win
0 Murray Nadal Hard 1 0
1 Nadal Murray Clay 1 0
2 Nadal Murray Hard 0 1
3 Murray Federer Clay 1 0
0 Nadal Murray Hard 0 1
1 Murray Nadal Clay 0 1
2 Murray Nadal Hard 1 0
3 Federer Murray Clay 0 1
p1_win p2_win
p1_name p2_name Surface
Federer Murray Clay 0 1
Murray Federer Clay 1 0
Nadal Clay 0 1
Hard 2 0
Nadal Murray Clay 1 0
Hard 0 2
p1_win p2_win p1_pct p2_pct
p1_name p2_name Surface
Federer Murray Clay 0 1 0.0 1.0
Murray Federer Clay 1 0 1.0 0.0
Nadal Clay 0 1 0.0 1.0
Hard 2 0 1.0 0.0
Nadal Murray Clay 1 0 1.0 0.0
Hard 0 2 0.0 1.0
Upvotes: 2