dowjie

Reputation: 95

Rolling over a series by avoiding NaNs

How can I perform a rolling computation that skips NaN values?

My series:

2019-05-01    0.1
2019-05-02    0.2
2019-05-03    NaN
2019-05-04    NaN
2019-05-05    NaN
2019-05-06    0.1
2019-05-07    0.5
2019-05-08    NaN
2019-05-09    0.1
2019-05-10    0.2
2019-05-11    NaN
2019-05-12    NaN
2019-05-13    0.3

I need to compute the rolling mean with a window of 2 over the non-NaN values only, so that my output is:

2019-05-01     NaN
2019-05-02    0.15
2019-05-03     NaN
2019-05-04     NaN
2019-05-05     NaN
2019-05-06    0.15
2019-05-07    0.30
2019-05-08     NaN
2019-05-09    0.30
2019-05-10    0.15
2019-05-11     NaN
2019-05-12     NaN
2019-05-13    0.25

With a plain rolling(2).mean(), any window that does not contain 2 consecutive non-NaN values returns NaN, so it doesn't give what I need. This is the result I get:

2019-05-01     NaN
2019-05-02    0.15
2019-05-03     NaN
2019-05-04     NaN
2019-05-05     NaN
2019-05-06     NaN
2019-05-07    0.30
2019-05-08     NaN
2019-05-09     NaN
2019-05-10    0.15
2019-05-11     NaN
2019-05-12     NaN
2019-05-13     NaN
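
For anyone wanting to reproduce the behaviour described above, here is a minimal sketch. It assumes the series is called `s` (a name not given in the question) and rebuilds it from the values listed earlier.

```python
import numpy as np
import pandas as pd

# Rebuild the series from the question; the variable name `s` is an assumption.
idx = pd.date_range("2019-05-01", periods=13, freq="D")
s = pd.Series(
    [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
     np.nan, 0.1, 0.2, np.nan, np.nan, 0.3],
    index=idx,
    name="x",
)

# With an integer window, min_periods defaults to the window size,
# so every 2-row window that contains a NaN produces NaN.
plain = s.rolling(2).mean()
print(plain)
```

Only the rows where both the current and the previous day hold a value end up with a mean.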

Upvotes: 6

Views: 555

Answers (2)

MaPy

Reputation: 505

You can impute your data using the last observed value (forward fill):

temp = df.ffill()

Then calculate the rolling mean:

temp = temp.rolling(2).mean()

and finally put the NaNs back where the original data was missing:

temp = temp.where(df.notna())
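
The three steps above can be sketched end to end as follows. This is only an illustration on the question's data; the names `s`, `filled`, `rolled` and `result` are assumptions, and the sketch uses `Series.ffill()` and `Series.where()`.

```python
import numpy as np
import pandas as pd

# Series from the question; the variable name `s` is an assumption.
idx = pd.date_range("2019-05-01", periods=13, freq="D")
s = pd.Series(
    [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
     np.nan, 0.1, 0.2, np.nan, np.nan, 0.3],
    index=idx,
    name="x",
)

filled = s.ffill()                 # carry the last observed value forward
rolled = filled.rolling(2).mean()  # rolling mean on the gap-free series
result = rolled.where(s.notna())   # restore NaN where the input was NaN
print(result)
```

Note that this matches the requested output because the window is 2: the forward-filled value is exactly the previous non-NaN observation. With a larger window the filled values would be counted several times, so the result would no longer equal a rolling mean over the non-NaN values only.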

Upvotes: 0

BENY

Reputation: 323276

In your case, call dropna first, then apply rolling, and finally reindex back to the original index:

s.dropna().rolling(2).mean().reindex(s.index)
Out[796]: 
2019-05-01     NaN
2019-05-02    0.15
2019-05-03     NaN
2019-05-04     NaN
2019-05-05     NaN
2019-05-06    0.15
2019-05-07    0.30
2019-05-08     NaN
2019-05-09    0.30
2019-05-10    0.15
2019-05-11     NaN
2019-05-12     NaN
2019-05-13    0.25
Name: x, dtype: float64
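
As a self-contained sketch of the same idea, the one-liner can be wrapped in a small helper. The function name `rolling_mean_skipna` is hypothetical (it is not a pandas API), and the series is rebuilt from the question's values.

```python
import numpy as np
import pandas as pd


def rolling_mean_skipna(series: pd.Series, window: int) -> pd.Series:
    """Rolling mean over the non-NaN values only (hypothetical helper)."""
    valid = series.dropna()                # keep only observed values
    rolled = valid.rolling(window).mean()  # window counts observed rows
    return rolled.reindex(series.index)    # dropped dates come back as NaN


idx = pd.date_range("2019-05-01", periods=13, freq="D")
s = pd.Series(
    [0.1, 0.2, np.nan, np.nan, np.nan, 0.1, 0.5,
     np.nan, 0.1, 0.2, np.nan, np.nan, 0.3],
    index=idx,
    name="x",
)

result = rolling_mean_skipna(s, 2)
print(result)
```

Because the window is applied after dropping the NaNs, it counts observed rows rather than calendar days, so this approach also works for windows larger than 2.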

Upvotes: 6
