Reputation: 3
Given the following dataset, I must insert the last winner of each tennis confrontation if there is one.
player_1 | player_2 | winner | last_winner |
---|---|---|---|
Federer | Nadal | Federer | |
Djoko | Federer | Djoko | |
Nadal | Federer | Nadal | |
Djoko | Federer | Federer | |
Murray | Djoko | Murray | |
Djoko | Federer | Djoko |
Then it should look like this :
player_1 | player_2 | winner | last_winner |
---|---|---|---|
Federer | Nadal | Federer | none |
Djoko | Federer | Djoko | none |
Nadal | Federer | Nadal | Federer |
Djoko | Federer | Federer | Djoko |
Murray | Djoko | Murray | none |
Djoko | Federer | Djoko | Federer |
This is the code I use with a for loop :
for i,j in tennis.iterrows() :
try :
tennis.loc[i, 'last_winner'] = tennis.iloc[:i].loc[(tennis.player_1.isin([j.player_1, j.player_2]) & tennis.player_2.isin([j.player_1, j.player_2])), "winner"].iloc[-1]
except:
tennis.loc[i, 'last_winner'] = "none"
Is there any mean to perform the same operation without any for loop, using something like an apply function ?
Upvotes: 0
Views: 30
Reputation: 2787
In a single line of code:
df["last_winner"] = df.groupby(
pd.Series(
df[["player_1", "player_2"]].values.tolist()) \
.apply(sorted).astype(str)) \
["winner"].shift()
This can be split up into sections:
pd.Series
)Groupby
this series"winner"
column for each group and shift
.Upvotes: 1