Pandas rolling mean don't change numbers to NaN in DataFrame

Question

I'm working with a pandas DataFrame which looks like this:

(N.B. the offset is set as the index of the DataFrame.)

offset         X         Y         Z
  0   -0.140137   -1.924316   -0.426758
 10   -2.789123   -1.111212   -0.416016
 20   -0.133789   -1.923828   -4.408691
 30   -0.101112   -1.457891   -0.425781
 40   -0.126465   -1.926758   -0.414062
 50   -0.137207   -1.916992   -0.404297
 60   -0.130371   -3.784591   -0.987654
 70   -0.125000   -1.918457   -0.403809
 80   -0.123456   -1.917480   -0.413574
 90   -0.126465   -1.926758   -0.333554

I applied a rolling mean with a window size of 5 to the DataFrame using the code below. I need to keep the window size at 5, and I need values for every offset in the DataFrame (no NaNs).

df = df.rolling(center=False, window=5).mean()

Which gives me:

offset         X         Y         Z
 0.0       NaN       NaN       NaN
10.0       NaN       NaN       NaN
20.0       NaN       NaN       NaN
30.0       NaN       NaN       NaN
40.0 -0.658125 -1.668801 -1.218262
50.0 -0.657539 -1.667336 -1.213769
60.0 -0.125789 -2.202012 -1.328097
70.0 -0.124031 -2.200938 -0.527121
80.0 -0.128500 -2.292856 -0.524679
90.0 -0.128500 -2.292856 -0.508578

I would like the first rows, which come out as NaN, to keep their original values, and the rest of the rows to hold the result of the rolling mean. Is there a simple way to do this? Thanks

i.e.

offset         X         Y         Z
 0.0  -0.140137  -1.924316  -0.426758
10.0  -2.789123  -1.111212  -0.416016
20.0  -0.133789  -1.923828  -4.408691
30.0  -0.101112  -1.457891  -0.425781
40.0  -0.658125  -1.668801  -1.218262
50.0  -0.657539  -1.667336  -1.213769
60.0  -0.125789  -2.202012  -1.328097
70.0  -0.124031  -2.200938  -0.527121
80.0  -0.128500  -2.292856  -0.524679
90.0  -0.128500  -2.292856  -0.508578

user2285236 · Accepted Answer

You can fill the NaNs with the values from the original df:

df.rolling(center=False, window=5).mean().fillna(df)
Out: 
               X         Y         Z
offset                              
0      -0.140137 -1.924316 -0.426758
10     -2.789123 -1.111212 -0.416016
20     -0.133789 -1.923828 -4.408691
30     -0.101112 -1.457891 -0.425781
40     -0.658125 -1.668801 -1.218262
50     -0.657539 -1.667336 -1.213769
60     -0.125789 -2.202012 -1.328097
70     -0.124031 -2.200938 -0.527121
80     -0.128500 -2.292856 -0.524679
90     -0.128500 -2.292856 -0.508578
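
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the numbers in the question (the variable names rolled and filled are just mine) and relies on fillna accepting a DataFrame, which is aligned on index and columns:

```python
import pandas as pd

# Data copied from the question; "offset" is the index.
df = pd.DataFrame(
    {
        "X": [-0.140137, -2.789123, -0.133789, -0.101112, -0.126465,
              -0.137207, -0.130371, -0.125000, -0.123456, -0.126465],
        "Y": [-1.924316, -1.111212, -1.923828, -1.457891, -1.926758,
              -1.916992, -3.784591, -1.918457, -1.917480, -1.926758],
        "Z": [-0.426758, -0.416016, -4.408691, -0.425781, -0.414062,
              -0.404297, -0.987654, -0.403809, -0.413574, -0.333554],
    },
    index=pd.Index(range(0, 100, 10), name="offset"),
)

# The first window - 1 = 4 rows have no full window, so they come out as NaN.
rolled = df.rolling(window=5).mean()

# fillna with a DataFrame replaces each NaN with the value at the same
# index/column position in df; non-NaN cells are left untouched.
filled = rolled.fillna(df)

print(filled)
```

Note that this fills every NaN in the rolled result, not only the leading rows, so if the original data itself had gaps the outcome could differ from what you expect.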

There is also an argument, min_periods, that you can use. If you pass min_periods=1, the first value is kept as is, the second value becomes the mean of the first two, and so on. This might make more sense in some cases.

df.rolling(center=False, window=5, min_periods=1).mean()
Out: 
               X         Y         Z
offset                              
0      -0.140137 -1.924316 -0.426758
10     -1.464630 -1.517764 -0.421387
20     -1.021016 -1.653119 -1.750488
30     -0.791040 -1.604312 -1.419311
40     -0.658125 -1.668801 -1.218262
50     -0.657539 -1.667336 -1.213769
60     -0.125789 -2.202012 -1.328097
70     -0.124031 -2.200938 -0.527121
80     -0.128500 -2.292856 -0.524679
90     -0.128500 -2.292856 -0.508578
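
To see what min_periods=1 does on numbers that are easy to check by hand, here is a small sketch on a toy Series (the data and the window of 3 are made up for illustration, not taken from the question):

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Default min_periods equals the window size, so the first two rows are NaN.
strict = s.rolling(window=3).mean()

# With min_periods=1 the leading rows use however many values are available:
# row 0 -> 1.0, row 1 -> (1 + 2) / 2 = 1.5, then full 3-value windows.
partial = s.rolling(window=3, min_periods=1).mean()

print(partial.tolist())  # approximately [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
```

So the leading rows are averages over a shorter window rather than the raw values, which is the difference from the fillna approach.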
