Reputation: 4357
I am trying to compute a rolling mean over the previous x values, excluding the current one. According to the documentation, the rolling method includes the current (last) value in each window.
This behavior can be seen in the following example from the documentation:
In [51]: ser = pd.Series(np.random.randn(10), index=pd.date_range('1/1/2000', periods=10))
In [52]: ser.rolling(window=5, win_type='triang').mean()
Out[52]:
2000-01-01 NaN
2000-01-02 NaN
2000-01-03 NaN
2000-01-04 NaN
2000-01-05 -1.037870
2000-01-06 -0.767705
2000-01-07 -0.383197
2000-01-08 -0.395513
2000-01-09 -0.558440
2000-01-10 -0.672416
Freq: D, dtype: float64
In my specific case, I want a window of 5 to take the mean from 2000-01-01 to 2000-01-05 and display it on 2000-01-06.
Below is a more representative example:
Team 1994 1995 1996 1997 1998 1999
Team 1 4 1 4 10 2 1
Team 2 2 5 1 2 1 4
Team 3 4 1 7 3 9 4
Taking the rolling mean of the past 3 seasons would look like this:
Team 1994 1995 1996 1997 1998 1999
Team 1 NaN NaN NaN 3.00 5.00 5.33
Team 2 NaN NaN NaN 2.67 2.67 1.33
Team 3 NaN NaN NaN 4.00 3.67 6.33
Upvotes: 4
Views: 5506
Reputation: 294586
If I understand you correctly, then:
ser.rolling(window=5, win_type='triang').mean().shift()
Should do it.
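To see why the shift works, here is a minimal sketch on a small hand-made series. The values 1 to 6 are an assumption chosen for easy arithmetic (they are not from the question), and an unweighted window is used instead of win_type='triang' so the means can be checked by hand.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Default behavior: the window ending at row i includes row i itself
included = s.rolling(window=3).mean()

# Shifting the result down by one row means row i reports the mean
# of rows i-3 .. i-1, so the current value is excluded
excluded = s.rolling(window=3).mean().shift()

print(pd.DataFrame({"value": s, "included": included, "excluded": excluded}))
```

The "excluded" column has one more leading NaN than "included", because the first full window of three previous values is only available at the fourth row.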
For your more representative example:
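from io import StringIO
import pandas as pd

text = """Team    1994  1995  1996  1997  1998  1999
Team 1  4  1  4  10  2  1
Team 2  2  5  1  2  1  4
Team 3  4  1  7  3  9  4"""
# Columns are separated by two or more spaces, so "Team 1" stays a single field
df = pd.read_csv(StringIO(text), delimiter=r'\s{2,}', engine='python', index_col=0)
# Transpose so seasons run down the rows, roll, shift by one season, transpose back
print(df.T.rolling(3).mean().shift().T)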
1994 1995 1996 1997 1998 1999
Team
Team 1 NaN NaN NaN 3.000000 5.000000 5.333333
Team 2 NaN NaN NaN 2.666667 2.666667 1.333333
Team 3 NaN NaN NaN 4.000000 3.666667 6.333333
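As a possible alternative to shifting, the sketch below uses the closed='left' argument of rolling, which leaves the current row out of each window. This assumes pandas 1.2 or later, where closed is supported for fixed-size windows. The DataFrame is built directly from the question's table rather than parsed from text, and the variable names are only illustrative.

```python
import pandas as pd

# Same data as the question's table, built directly
df = pd.DataFrame(
    {"1994": [4, 2, 4], "1995": [1, 5, 1], "1996": [4, 1, 7],
     "1997": [10, 2, 3], "1998": [2, 1, 9], "1999": [1, 4, 4]},
    index=pd.Index(["Team 1", "Team 2", "Team 3"], name="Team"),
)

# Approach from the answer: roll over seasons, then shift by one season
shifted = df.T.rolling(3).mean().shift().T

# Assumed alternative (pandas >= 1.2): exclude the current season from the window
left_closed = df.T.rolling(3, closed="left").mean().T

print(left_closed.round(2))
```

On this data both approaches should produce the same table. The shift version also works on older pandas releases.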
Upvotes: 5