Konninger
Konninger

Reputation: 87

pandas:conditional rolling sum of two columns

I'd like to calculate a measure for momentum of soccer teams, in this case the points that a certain team got in its last 3 games. My data looks as follows:

    HomeTeam    AwayTeam    H_Pts   A_Pts
    Barcelona   Getafe      3       0
    Levante     Barcelona   1       1
    Barcelona   Las Palmas  3       0
    Las Palmas  Barcelona   3       0
    Barcelona   Madrid      1       1

This is just a sample for some Barcelona games. So basically what i want to end up with is two additional columns (say Home_Momentum, Away_Momentum), which adds up the points that this particular team got in its last 3 matches (not including the current one). So it should look something like this:

    HomeTeam    AwayTeam    H_Pts   A_Pts    Home_Momentum    Away_Momentum
    Barcelona   Getafe      3       0        NaN              NaN
    Levante     Barcelona   1       1        NaN              NaN
    Barcelona   Las Palmas  3       0        NaN              NaN
    Las Palmas  Barcelona   3       0        x                7    
    Barcelona   Madrid      1       1        4                y

where x (y) is the sum of points that Las Palmas (Madrid) collected in its last 3 games.

What I've come up with so far is:

data["Home_Momentum"] = data.groupby("HomeTeam")["H_Pts"].apply(lambda x: x.rolling(3).sum().shift())

But the problem with this is that it doesnt consider the away games of the team.

Do you have any ideas how to solve this?

Upvotes: 1

Views: 1034

Answers (1)

piRSquared
piRSquared

Reputation: 294508

Rename, columns to be a multiindex. The stack and run a rolling sum

df.columns = [
    ['Team', 'Team', 'Points', 'Points'],
    ['Home', 'Away', 'Home', 'Away']
]

d1 = df.stack()

mom = d1.groupby('Team').Points.apply(lambda x: x.shift().rolling(3).sum())

d1.assign(Momentum=mom).unstack()

  Points             Team             Momentum     
    Away Home        Away        Home     Away Home
0      0    3      Getafe   Barcelona      NaN  NaN
1      1    1   Barcelona     Levante      NaN  NaN
2      0    3  Las Palmas   Barcelona      NaN  NaN
3      0    3   Barcelona  Las Palmas      7.0  NaN
4      1    1      Madrid   Barcelona      NaN  4.0

We can include sums over few games than 3 as well.

df.columns = [
    ['Team', 'Team', 'Points', 'Points'],
    ['Home', 'Away', 'Home', 'Away']
]

d1 = df.stack()

mom = d1.groupby('Team').Points.apply(lambda x: x.shift().rolling(3, 1).sum())

d1.assign(Momentum=mom).unstack()

  Points             Team             Momentum     
    Away Home        Away        Home     Away Home
0      0    3      Getafe   Barcelona      NaN  NaN
1      1    1   Barcelona     Levante      3.0  NaN
2      0    3  Las Palmas   Barcelona      NaN  4.0
3      0    3   Barcelona  Las Palmas      7.0  0.0
4      1    1      Madrid   Barcelona      NaN  4.0

Upvotes: 2

Related Questions