Reputation: 95
How can I perform a rolling computation that skips the NaN values in my series?
My series:
2019-05-01 0.1
2019-05-02 0.2
2019-05-03 NaN
2019-05-04 NaN
2019-05-05 NaN
2019-05-06 0.1
2019-05-07 0.5
2019-05-08 NaN
2019-05-09 0.1
2019-05-10 0.2
2019-05-11 NaN
2019-05-12 NaN
2019-05-13 0.3
I need to compute the rolling mean with a window of 2 over the non-NaN values only, keeping NaN at the dates where the input is NaN, so that my output is:
2019-05-01 NaN
2019-05-02 0.15
2019-05-03 NaN
2019-05-04 NaN
2019-05-05 NaN
2019-05-06 0.15
2019-05-07 0.30
2019-05-08 NaN
2019-05-09 0.30
2019-05-10 0.15
2019-05-11 NaN
2019-05-12 NaN
2019-05-13 0.25
Using rolling directly, the mean is NaN whenever the window does not contain 2 consecutive non-NaN values, so it doesn't work (below is the result without dropping the NaNs first):
2019-05-01 NaN
2019-05-02 0.15
2019-05-03 NaN
2019-05-04 NaN
2019-05-05 NaN
2019-05-06 NaN
2019-05-07 0.30
2019-05-08 NaN
2019-05-09 NaN
2019-05-10 0.15
2019-05-11 NaN
2019-05-12 NaN
2019-05-13 NaN
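For reference, the behaviour described above can be reproduced with a short sketch. It assumes the series is a float Series on a daily DatetimeIndex; the variable name s and the series name "x" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the series from the question (the name "x" is assumed).
idx = pd.date_range("2019-05-01", periods=13, freq="D")
values = [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
          np.nan, 0.1, 0.2, np.nan, np.nan, 0.3]
s = pd.Series(values, index=idx, name="x")

# Plain rolling mean: min_periods defaults to the window size (2),
# so every window that contains a NaN produces NaN.
naive = s.rolling(2).mean()
print(naive)
```

Only the dates whose previous calendar day is also non-NaN get a value, which is why 2019-05-06, 2019-05-09 and 2019-05-13 come out as NaN.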
Upvotes: 6
Views: 555
Reputation: 505
You can impute the missing data by forward-filling the last observed value:
temp = df.ffill()
Then calculate the rolling mean:
temp = temp.rolling(2).mean()
and finally set the positions that were NaN in the original data back to NaN:
temp = temp.where(df.notna())
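The three steps above can be put together as a self-contained sketch. It assumes the data is the Series from the question (called s here, an assumed name) rather than a DataFrame, and it restores the NaN positions with where, which is one of several ways to apply the mask.

```python
import numpy as np
import pandas as pd

# Series from the question (variable and series names are assumed).
idx = pd.date_range("2019-05-01", periods=13, freq="D")
values = [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
          np.nan, 0.1, 0.2, np.nan, np.nan, 0.3]
s = pd.Series(values, index=idx, name="x")

# 1) Carry the last observed value forward into the gaps.
filled = s.ffill()

# 2) Rolling mean over the gap-free series.
rolled = filled.rolling(2).mean()

# 3) Keep results only where the original series had a value.
result = rolled.where(s.notna())
print(result)
```

Note that this matches the requested output because the window is 2: the value just before each observation is then exactly the previous observation. With a larger window the forward-filled copies would be counted several times, so the result would generally differ from a mean over the last non-NaN observations.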
Upvotes: 0
Reputation: 323276
In your case, call dropna first, then rolling, and finally reindex back to the original index:
s.dropna().rolling(2).mean().reindex(s.index)
Out[796]:
2019-05-01 NaN
2019-05-02 0.15
2019-05-03 NaN
2019-05-04 NaN
2019-05-05 NaN
2019-05-06 0.15
2019-05-07 0.30
2019-05-08 NaN
2019-05-09 0.30
2019-05-10 0.15
2019-05-11 NaN
2019-05-12 NaN
2019-05-13 0.25
Name: x, dtype: float64
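The same idea can be wrapped in a small function so the window size is a parameter. This is only a sketch: rolling_mean_skipna is a hypothetical helper name, and the series is rebuilt here under the assumption that it has a daily DatetimeIndex and is named x.

```python
import numpy as np
import pandas as pd

def rolling_mean_skipna(series: pd.Series, window: int) -> pd.Series:
    """Rolling mean over the non-NaN observations only (hypothetical helper)."""
    valid = series.dropna()                # drop the gaps
    rolled = valid.rolling(window).mean()  # window counts observations, not days
    return rolled.reindex(series.index)    # put NaN back at the dropped dates

# Series from the question.
idx = pd.date_range("2019-05-01", periods=13, freq="D")
values = [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
          np.nan, 0.1, 0.2, np.nan, np.nan, 0.3]
s = pd.Series(values, index=idx, name="x")

out = rolling_mean_skipna(s, 2)
print(out)
```

Because the NaN rows are removed before rolling, the window always spans the last window observations regardless of how many days lie between them, and reindex reinserts NaN at every date that was dropped.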
Upvotes: 6