Reputation: 647
I have an dataframe of football results, and am attempting to make a new column at the end of the dataframe which shows which team won. I'm attempting to do this using df.apply
. Here is what i have so far:
def match_winner(winner,home_team,away_team,home_goals,away_goals):
if home_goals>away_goals:
winner = home_team
elif home_goals<away_goals:
winner = away_team
else:
winner = "None"
match['Winning Team'] =""
match['Winning Team'].apply(match_winner,args=[match['Home Team'],match['Away Team'],match['home_team_goal'],match['away_team_goal']])
And here is the structure of the match
dataframe:
<bound method DataFrame.info of id country_id league_id season stage date \
145 146 1 1 2008/2009 24 2009-02-27 00:00:00
153 154 1 1 2008/2009 25 2009-03-08 00:00:00
155 156 1 1 2008/2009 25 2009-03-07 00:00:00
162 163 1 1 2008/2009 26 2009-03-13 00:00:00
168 169 1 1 2008/2009 26 2009-03-14 00:00:00
... ... ... ... ... ... ...
25972 25973 24558 24558 2015/2016 8 2015-09-13 00:00:00
25974 25975 24558 24558 2015/2016 9 2015-09-22 00:00:00
25975 25976 24558 24558 2015/2016 9 2015-09-23 00:00:00
25976 25977 24558 24558 2015/2016 9 2015-09-23 00:00:00
25978 25979 24558 24558 2015/2016 9 2015-09-23 00:00:00
match_api_id home_team_api_id away_team_api_id home_team_goal \
145 493017 8203 9987 2
153 493025 9984 8342 1
155 493027 8635 10000 2
162 493034 8203 8635 2
168 493040 10000 9999 0
... ... ... ... ...
25972 1992089 10243 10191 3
25974 1992091 10190 10191 1
25975 1992092 9824 10199 1
25976 1992093 9956 10179 2
25978 1992095 10192 9931 4
away_team_goal goal shoton shotoff foulcommit card cross corner \
145 1 None None None None None None None
153 3 None None None None None None None
155 0 None None None None None None None
162 1 None None None None None None None
168 0 None None None None None None None
... ... ... ... ... ... ... ... ...
25972 3 None None None None None None None
25974 0 None None None None None None None
25975 2 None None None None None None None
25976 0 None None None None None None None
25978 3 None None None None None None None
possession BSA Home Team Away Team \
145 None 2.25 KV Mechelen KRC Genk
153 None 2.38 KSV Cercle Brugge Club Brugge KV
155 None 7.00 RSC Anderlecht SV Zulte-Waregem
162 None 1.75 KV Mechelen RSC Anderlecht
168 None 4.33 SV Zulte-Waregem KSV Roeselare
... ... ... ... ...
25972 None NaN FC Zürich FC Thun
25974 None NaN FC St. Gallen FC Thun
25975 None NaN FC Vaduz FC Luzern
25976 None NaN Grasshopper Club Zürich FC Sion
25978 None NaN BSC Young Boys FC Basel
League Country home_player_1 home_player_2 \
145 Belgium Jupiler League Belgium False False
153 Belgium Jupiler League Belgium False False
155 Belgium Jupiler League Belgium False False
162 Belgium Jupiler League Belgium False False
168 Belgium Jupiler League Belgium False False
... ... ... ... ...
25972 Switzerland Super League Switzerland False False
25974 Switzerland Super League Switzerland False False
25975 Switzerland Super League Switzerland False False
25976 Switzerland Super League Switzerland False False
25978 Switzerland Super League Switzerland False False
home_player_3 home_player_4 home_player_5 home_player_6 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_7 home_player_8 home_player_9 home_player_10 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_11 away_player_1 away_player_2 away_player_3 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_4 away_player_5 away_player_6 away_player_7 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_8 away_player_9 away_player_10 away_player_11 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
Winning Team
145
153
155
162
168
... ...
25972
25974
25975
25976
25978
[21374 rows x 47 columns]>
I need to send multiple columns as input arguements in order to find who's won but i'm not sure if .apply()
allows you to pass in a series as an argument? Is it possible to do this using .apply()
. If so a solution would be helpful, but if anyone knows a better alternative that would also be useful.
Upvotes: 1
Views: 88
Reputation: 862701
Use numpy.select
instead apply
, which should be not good idea here (loops under the hood):
match = pd.DataFrame({
'Home Team':list('abcdef'),
'Away Team':list('ghijkl'),
'home_team_goal':[14,5,4,5,5,4],
'away_team_goal':[7,8,2,4,8,4],
})
m1 = match.home_team_goal>match.away_team_goal
m2 = match.home_team_goal<match.away_team_goal
match['winner'] = np.select([m1, m2], [match['Home Team'], match['Away Team']], default=None)
print (match)
Home Team Away Team home_team_goal away_team_goal winner
0 a g 14 7 a
1 b h 5 8 h
2 c i 4 2 c
3 d j 5 4 d
4 e k 5 8 k
5 f l 4 4 None
Upvotes: 2
Reputation: 3184
Here's another way:
Creating dataframe with the teams:
results_di = {
"Home Team": [
"KV Mechelen",
"KSV Cercle Brugge",
"RSC Anderlecht",
"KV Mechelen",
"SV Zulte-Waregem",
"FC Zürich",
"FC St. Gallen",
"FC Vaduz",
"Grasshopper Club Zürich",
"BSC Young Boys",
],
"Away Team": [
"KRC Genk",
"Club Brugge KV",
"SV Zulte-Waregem",
"RSC Anderlecht",
"KSV Roeselare",
"FC Thun",
"FC Thun",
"FC Luzern",
"FC Sion",
"FC Basel",
],
"home_team_goal": [2, 1, 2, 2, 0, 3, 1, 1, 2, 4],
"away_team_goal": [1, 3, 0, 1, 0, 3, 0, 2, 0, 3],
}
df = pd.DataFrame(results_di)
df['winner'] = 'none'
home_win = df['home_team_goal'] - df["away_team_goal"] > 0
away_win = df["away_team_goal"] - df['home_team_goal'] > 0
df.loc[home_win, "winner"] = df.loc[home_win, 'Home Team']
df.loc[away_win, "winner"] = df.loc[away_win, 'Away Team']
print(df)
Home Team Away Team home_team_goal away_team_goal \
0 KV Mechelen KRC Genk 2 1
1 KSV Cercle Brugge Club Brugge KV 1 3
2 RSC Anderlecht SV Zulte-Waregem 2 0
3 KV Mechelen RSC Anderlecht 2 1
4 SV Zulte-Waregem KSV Roeselare 0 0
5 FC Zürich FC Thun 3 3
6 FC St. Gallen FC Thun 1 0
7 FC Vaduz FC Luzern 1 2
8 Grasshopper Club Zürich FC Sion 2 0
9 BSC Young Boys FC Basel 4 3
winner
0 KV Mechelen
1 Club Brugge KV
2 RSC Anderlecht
3 KV Mechelen
4 none
5 none
6 FC St. Gallen
7 FC Luzern
8 Grasshopper Club Zürich
9 BSC Young Boys
Upvotes: 1
Reputation: 107652
Avoid DataFrame.apply
(which is usually run as a hidden loop) and instead consider nested conditional logic with numpy.where
on columns:
match['Winning Team'] = np.where(match['home_team_goal'] > match['away_team_goal'],
match['Home Team'],
np.where(match['home_team_goal'] < match['away_team_goal'],
match['Away Team'],
np.nan
)
)
Upvotes: 1