Removing NaNs occurring after a rolling operation on a pandas dataframe

Question

I have the following dataframe:

datetime
2001-01-01    3.22800
2002-01-01    2.69000
2003-01-01    2.75725
2004-01-01    3.22800
2005-01-01    3.36250
2006-01-01    3.83325
2007-01-01    2.82450
2008-01-01    4.77475
2009-01-01    3.63150
2010-01-01    3.69875
2011-01-01    4.90925
2012-01-01    3.96775
2013-01-01    4.64025
2014-01-01    4.64025
2015-01-01    4.57300

I want to compute the mean with a rolling window of size 3:

df.rolling(window=3).mean()

This gives me:

datetime
2001-01-01         NaN
2002-01-01         NaN
2003-01-01    2.891750
2004-01-01    2.891750
2005-01-01    3.115917
2006-01-01    3.474583
2007-01-01    3.340083
2008-01-01    3.810833
2009-01-01    3.743583
2010-01-01    4.035000
2011-01-01    4.079833
2012-01-01    4.191917
2013-01-01    4.505750
2014-01-01    4.416083
2015-01-01    4.617833

Is there a way to compute a rolling mean (or any other function) such that no NaNs are obtained? That is, for the first 2 rows in the input we would use the next 3 rows, and for the last 2 rows of the dataframe we would use the previous 3 rows?

I cannot assume that the data is circular, i.e. the first row does not follow the last row.

Scott Boston · Accepted Answer

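Use shift and ffill. shift(-2) moves each rolling mean up two rows, so the NaNs end up at the bottom instead of the top, and ffill then fills them with the last valid value: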

df.rolling(window=3).mean().shift(-2).ffill()

Output:

2001-01-01  2.891750
2002-01-01  2.891750
2003-01-01  3.115917
2004-01-01  3.474583
2005-01-01  3.340083
2006-01-01  3.810833
2007-01-01  3.743583
2008-01-01  4.035000
2009-01-01  4.079833
2010-01-01  4.191917
2011-01-01  4.505750
2012-01-01  4.416083
2013-01-01  4.617833
2014-01-01  4.617833
2015-01-01  4.617833
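
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same shift-and-ffill approach. It assumes the data is a Series with a yearly DatetimeIndex named datetime (the question does not show how the frame was built), and the variable names window, trailing, forward and result are only illustrative. The shift amount is written as -(window - 1) so the same idea should carry over to other window sizes.

```python
import pandas as pd

# Values copied from the question; the index construction is an assumption.
values = [3.22800, 2.69000, 2.75725, 3.22800, 3.36250, 3.83325, 2.82450,
          4.77475, 3.63150, 3.69875, 4.90925, 3.96775, 4.64025, 4.64025,
          4.57300]
index = pd.date_range("2001-01-01", periods=len(values), freq="YS",
                      name="datetime")
s = pd.Series(values, index=index)

window = 3

# Trailing rolling mean: the first (window - 1) rows are NaN because
# a full window of observations is not yet available.
trailing = s.rolling(window=window).mean()

# Shift the result up so each row holds the mean of itself and the
# next (window - 1) rows; the NaNs now sit in the last rows.
forward = trailing.shift(-(window - 1))

# Fill the trailing NaNs with the last complete window's mean.
result = forward.ffill()

print(result)
```

Note that the last three rows all carry the mean of the final complete window, so they repeat a value rather than being recomputed from a different set of rows.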
