santon

Reputation: 4605

Forward fill pandas column not with last value, but with mean over non-null and null elements

I run into this a lot when modeling time series. Sometimes data is reported at different frequencies, say one series daily and one weekly. What I'd like is not to forward fill the weekly data point for every day of the week (since it is usually already a sum of all the values during the week), but to forward fill or replace the data with its mean, taken over the reported value and the nulls that follow it. In essence, I'd like to spread the data out evenly.

So if I have

import numpy as np
import pandas as pd
s = pd.Series(index=pd.date_range('2015/1/1', '2015/1/9'),
              data=[2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan])

then I'd like to return

2015-01-01     1
2015-01-02     1
2015-01-03     2
2015-01-04     2
2015-01-05     2
2015-01-06   0.5
2015-01-07   0.5
2015-01-08   0.5
2015-01-09   0.5
Freq: D, dtype: float64

Any thoughts on an easy way to do this? Is a for-loop inescapable?

Upvotes: 1

Views: 309

Answers (1)

Jianxun Li

Reputation: 24742

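Here is one way: use .cumsum on the non-null mask to split the series into groups (each reported value plus the nulls that follow it), then forward fill and use transform to divide each group by its length.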

s.ffill().groupby(s.notnull().cumsum()).transform(lambda g: g / len(g))

2015-01-01    1.0
2015-01-02    1.0
2015-01-03    2.0
2015-01-04    2.0
2015-01-05    2.0
2015-01-06    0.5
2015-01-07    0.5
2015-01-08    0.5
2015-01-09    0.5
Freq: D, dtype: float64
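To make the grouping easier to follow, here is the same idea broken into steps. This is only a sketch on the sample series from the question; the names group_ids, filled and spread are mine and are just for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(
    index=pd.date_range("2015-01-01", "2015-01-09"),
    data=[2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
)

# Each non-null value starts a new group; the NaNs after it share its id.
group_ids = s.notnull().cumsum()

# Carry each reported value forward across its group.
filled = s.ffill()

# Divide by the group size so each reported total is spread evenly.
spread = filled / filled.groupby(group_ids).transform("size")

print(spread)
```

Since every group still adds up to its reported value, spread.sum() matches s.sum(). One caveat: if the series starts with NaN, those leading rows have nothing to forward fill from, so they stay NaN.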

Upvotes: 4
