Reputation: 49
Suppose I have a large dataframe with a structure similar to the one below:
home| away| home_score| away_score
A| B| 1| 0
B| C| 1| 1
C| A| 1| 0
I want to find each team's last score regardless of whether it played home or away. For example, the last scores of teams A, B and C are 0, 1 and 1 respectively. I then want to fill the last scores back into the original dataframe:
home| away| home_score| away_score| last_score_home| last_score_away|
A| B| 1| 0| | |
B| C| 1| 1| 0| |
C| A| 1| 0| 1| 1|
...
I have tried groupby and shift but I am not sure how to combine the home / away results.
Upvotes: 4
Views: 77
Reputation: 215057
You can try something like this: 1) make all column names splittable by adding a suffix to the first two column names; 2) split the column headers and transform them into a multi-index; 3) melt the table to long format with stack, group by the teams and get the latest score:
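# regex=True is passed explicitly: recent pandas versions treat the pattern as a literal string by default
df.columns = df.columns.str.replace(r"^([^_]+)$", r"\1_team", regex=True).str.split("_", expand=True)
df.stack(level=0).groupby("team").tail(1)
#         score team
# 1 home      1    B
# 2 away      0    A
#   home      1    C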
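To make the three steps easier to follow, here is a self-contained sketch that rebuilds the sample data from the question and keeps each intermediate result in its own variable. The names `cols`, `wide`, `stacked` and `last` are only illustrative, and working on a copy rather than overwriting `df.columns` is a choice made for this sketch.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "home": ["A", "B", "C"],
    "away": ["B", "C", "A"],
    "home_score": [1, 1, 1],
    "away_score": [0, 1, 0],
})

# 1) Add a "_team" suffix to names without an underscore,
#    so every column reads <side>_<field>
cols = df.columns.str.replace(r"^([^_]+)$", r"\1_team", regex=True)

# 2) Split on "_" to get two-level column labels: (side, field)
wide = df.copy()
wide.columns = cols.str.split("_", expand=True)

# 3) Move the side level (home/away) from the columns into the row index,
#    giving one row per team per match
stacked = wide.stack(level=0)

# Last row per team, whichever side the team played on
last = stacked.groupby("team").tail(1)
print(last)
```

This relies on the rows of `df` already being in chronological order, since `tail(1)` simply takes the final row in each group.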
Update:
To merge it back into the original data frame, you can use join:
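# regex=True is passed explicitly: recent pandas versions treat the pattern as a literal string by default
df.columns = df.columns.str.replace(r"^([^_]+)$", r"\1_team", regex=True).str.split("_", expand=True)
# latest row per team, regardless of home / away
df1 = df.stack(level=0).groupby("team").tail(1)
# join the result back to the original transformed data frame
df2 = df.stack(level=0).join(df1.score, rsuffix="_last").unstack(level=1)
# flatten the two-level column labels back to single strings
df2.columns = [x + "_" + y for x, y in df2.columns]
df2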
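If the goal is the score from each team's previous match, as in the expected table in the question, one possible variation is to reshape to long format and use groupby with shift. This is only a sketch and not part of the code above: it assumes the rows are in chronological order and that the index is unique, and the names `long`, `wide` and `out` are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "home": ["A", "B", "C"],
    "away": ["B", "C", "A"],
    "home_score": [1, 1, 1],
    "away_score": [0, 1, 0],
})

# Build one row per (match, side) with uniform column names
parts = []
for side in ("home", "away"):
    part = df[[side, f"{side}_score"]].rename(
        columns={side: "team", f"{side}_score": "score"}
    )
    parts.append(part.assign(match=df.index, side=side))

# Order by match so each team's rows are chronological
long = pd.concat(parts, ignore_index=True).sort_values("match", kind="stable")

# Previous score of the same team, regardless of side
long["last_score"] = long.groupby("team")["score"].shift()

# Back to one row per match, one column per side
wide = long.pivot(index="match", columns="side", values="last_score")
out = df.assign(last_score_home=wide["home"], last_score_away=wide["away"])
print(out)
```

A team's first appearance has no earlier match, so its entry is NaN, which corresponds to the blank cells in the question's table.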
Upvotes: 4