Reputation: 43
Let say I have some data associated with football
Date Home Away HomeGoal AwayGoal TotalGoal
2019 Arsenal MU 5 1 6
2019 MCity Liv 2 2 4
2019 MU Liv 3 4 7
2019 MCity MU 0 0 0
I want to create a column of data which show the average goals of that team in the recent 2 games. For example in the last row, I want to include a column that shows the average goal of MU in their last 2 games, which = (1+3)/2 = 2.
Is there any functions in python can achieve this?
Upvotes: 3
Views: 125
Reputation: 93161
For your requirement, you don't care if a team is Home or Away, only how many goal it scored per match. Try this:
# Rename the columns to make the unstacking operation a bit easier
# Always a good idea to specify an explicit `copy` when you intend
# to change the dataframe structure
>>> tmp = df[['Home', 'Away', 'HomeGoal', 'AwayGoal']].copy()
# Arrange the columns into a MultiIndex to make stacking easier
>>> tmp.columns = pd.MultiIndex.from_product([['Team', 'Goal'], ['Home', 'Away']])
# This is what `tmp` look like:
Team Goal
Home Away Home Away
0 Arsenal MU 5 1
1 MCity Liv 2 2
2 MU Liv 3 4
3 MCity MU 0 0
# And now the magic
>>> tmp.stack() \
.groupby('Team').rolling(2).mean() \
.groupby('Team').tail(1) \
.droplevel([1,2])
# Result
Goal
Team
Arsenal NaN
Liv 3.0
MCity 1.0
MU 1.5
Here's how it works:
stack
unpivots Home
and Away
so that for every match, we have 2 rows for Teams
and Goal
groupby('Team').rolling(2).mean()
gets the rolling average of goal scored over the last 2 games per teamgroupby('Team').tail(1)
gets the last of those rolling averages per teamUpvotes: 1
Reputation: 1310
Try this way:
Split into two Data frames according to their Home
and Away
Goals
df1=df[['Date','Home','HomeGoal']]
df2 = df[['Date','Away','AwayGoal']]
all_dfs=[df1,df2]
Name the columns
for dfs in all_dfs:
dfs.columns = ['Date','Team', 'Goal']
Concat two dfs together
new_df=pd.concat(all_dfs,ignore_index=True).reset_index(drop=True)
Date Team Goal
0 2019 Arsenal 5
1 2019 Mcity 2
2 2019 MU 3
3 2019 Mcity 0
4 2019 MU 1
5 2019 Liv 2
6 2019 Liv 4
7 2019 MU 0
Average of last two games:
new_df[new_df['Team'] == 'MU'].sort_values('Date')['Goal'][:2].sum()/2
Total Goal wrt to Team in both away and home matches
new_df.groupby('Team')['Goal'].sum()
Team
Arsenal 5
Liv 6
MU 4
Mcity 2
Upvotes: 1