Pandas iterative imputation without using 'for' loop

Question

Given the following dataset, I need to add, for each tennis match, the winner of the previous match between the same two players, if there is one.

player_1	player_2	winner
Federer	Nadal	Federer
Djoko	Federer	Djoko
Nadal	Federer	Nadal
Djoko	Federer	Federer
Murray	Djoko	Murray
Djoko	Federer	Djoko

The result should look like this:

player_1	player_2	winner	last_winner
Federer	Nadal	Federer	none
Djoko	Federer	Djoko	none
Nadal	Federer	Nadal	Federer
Djoko	Federer	Federer	Djoko
Murray	Djoko	Murray	none
Djoko	Federer	Djoko	Federer

This is the code I use with a for loop :

for i,j in tennis.iterrows() :
    
    try :
        tennis.loc[i, 'last_winner'] = tennis.iloc[:i].loc[(tennis.player_1.isin([j.player_1, j.player_2]) & tennis.player_2.isin([j.player_1, j.player_2])), "winner"].iloc[-1]
    except:
        tennis.loc[i, 'last_winner'] = "none"

Is there any way to perform the same operation without a for loop, for example with an apply function?

Rawson · Accepted Answer

In a single line of code:

# df is the question's `tennis` DataFrame (a default RangeIndex is assumed)
df["last_winner"] = (
    df.groupby(
        pd.Series(df[["player_1", "player_2"]].values.tolist())
        .apply(sorted)
        .astype(str)
    )["winner"]
    .shift()
)

This can be broken down into steps:

1. Create a list of players 1 and 2 for each row (as a pd.Series).
2. Sort each list so that rows with the same two players match, regardless of which one is player 1 and which is player 2.
3. Turn the lists into strings so they can be used as groupby keys.
4. Group by this series.
5. Take the "winner" column for each group and shift it down by one row.
6. Assign the result to a new column of the DataFrame.
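The steps above can be sketched with intermediate variables on the question's data. This is a minimal illustration, not a different method: the names `pairs` and `match_key` are made up for readability, and the final `fillna("none")` is an addition, since `shift()` alone leaves `NaN` where the question's expected output shows "none".

```python
import pandas as pd

# Sample data copied from the question
tennis = pd.DataFrame({
    "player_1": ["Federer", "Djoko", "Nadal", "Djoko", "Murray", "Djoko"],
    "player_2": ["Nadal", "Federer", "Federer", "Federer", "Djoko", "Federer"],
    "winner":   ["Federer", "Djoko", "Nadal", "Federer", "Murray", "Djoko"],
})

# Steps 1-3: build one order-independent string key per row.
# Reusing tennis.index keeps the key aligned even without a default RangeIndex.
pairs = pd.Series(
    tennis[["player_1", "player_2"]].values.tolist(), index=tennis.index
)
match_key = pairs.apply(sorted).astype(str)

# Steps 4-6: within each pairing, shift "winner" down by one row so every
# match sees the winner of the previous match between the same two players.
tennis["last_winner"] = (
    tennis.groupby(match_key)["winner"].shift().fillna("none")
)

print(tennis)
```

Passing `index=tennis.index` is a small precaution: the one-liner builds the key with a fresh 0..n-1 index, which lines up with the DataFrame only when the DataFrame has that same default index.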
