Forward fill pandas column not with last value, but with mean over non-null and null elements

Question

I run into this a lot when modeling time series. Sometimes you have data reported at different frequencies, say one series daily and one weekly. What I'd like is not to forward fill the weekly data point for every day of the week (since it is usually already a sum of all the values during the week), but to forward fill or replace the data with its mean. In essence, I'd like to spread out the data.

So if I have

import numpy as np
import pandas as pd
s = pd.Series(index=pd.date_range('2015/1/1', '2015/1/9'),
              data=[2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan])

then I'd like to return

2015-01-01     1
2015-01-02     1
2015-01-03     2
2015-01-04     2
2015-01-05     2
2015-01-06   0.5
2015-01-07   0.5
2015-01-08   0.5
2015-01-09   0.5
Freq: D, dtype: float64

Any thoughts on an easy way to do this? Is a for-loop inescapable?

Jianxun Li · Accepted Answer

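Here is one way: take .cumsum of the non-null mask to split the series into groups (each non-null value starts a new group), forward fill, and then use transform to divide each group by its length.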

s.ffill().groupby(s.notnull().cumsum()).transform(lambda g: g / len(g))

2015-01-01    1.0
2015-01-02    1.0
2015-01-03    2.0
2015-01-04    2.0
2015-01-05    2.0
2015-01-06    0.5
2015-01-07    0.5
2015-01-08    0.5
2015-01-09    0.5
Freq: D, dtype: float64
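
To make the one-liner easier to follow, here is a minimal sketch that breaks it into named steps. The variable names (group_ids, filled, group_sizes, spread) are only illustrative and do not come from the answer, and it uses transform("size") in place of the lambda, which should give the same result on this example.

```python
import numpy as np
import pandas as pd

s = pd.Series(
    [2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
    index=pd.date_range("2015-01-01", "2015-01-09"),
)

# Each non-null value starts a new group: the running count of non-null
# entries stays constant across the NaNs that follow it.
group_ids = s.notna().cumsum()

# Carry each reported value forward over its block of NaNs.
filled = s.ffill()

# Number of rows in each group, broadcast back onto the original index.
group_sizes = filled.groupby(group_ids).transform("size")

# Spread each reported value evenly over the rows it covers.
spread = filled / group_sizes
print(spread)
```

Because each value is divided evenly across its own block, the total of the series is preserved. Note that if the series began with NaN, those leading rows would have nothing to fill from and would stay NaN.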
