FLab

Reputation: 7506

Pandas resample and ffill leaves NaN at the end

I want to up-sample a series from weekly to daily frequency by forward filling the result.

If the last observation of my original series is NaN, I expected it to be replaced by the previous valid value, but instead it remains NaN.

SETUP

import numpy as np
import pandas as pd

all_dates = pd.date_range(start='2018-01-01', freq='W-WED', periods=4)

ts = pd.Series([1, 2, 3], index=all_dates[:3])
ts[all_dates[3]] = np.nan

ts
Out[16]: 
2018-01-03    1.0
2018-01-10    2.0
2018-01-17    3.0
2018-01-24    NaN
Freq: W-WED, dtype: float64

RESULT

ts.resample('B').ffill()
Out[17]: 
2018-01-03    1.0
2018-01-04    1.0
2018-01-05    1.0
2018-01-08    1.0
2018-01-09    1.0
2018-01-10    2.0
2018-01-11    2.0
2018-01-12    2.0
2018-01-15    2.0
2018-01-16    2.0
2018-01-17    3.0
2018-01-18    3.0
2018-01-19    3.0
2018-01-22    3.0
2018-01-23    3.0
2018-01-24    NaN
Freq: B, dtype: float64

I was expecting the last value to be 3.0 as well.

Does anyone have an explanation for this behaviour?

Upvotes: 1

Views: 2700

Answers (2)

Sugimiyanto

Reputation: 329

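resample() returns a DatetimeIndexResampler, not a Series. Its ffill() only fills the dates that are newly created by the up-sampling; a NaN that already sits on a date in the original index is left as it is.

To fill that one too, you need to get back to a regular pandas Series first.

You can use the asfreq() method to do that before filling the NaN: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.asfreq.html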

So, this should work:

ts.resample('B').asfreq().ffill()
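
To make the difference visible, here is a small side-by-side sketch using the same data as the question. The names via_resampler and via_series are only illustrative, and the series is built in one step rather than by assigning the NaN afterwards.

```python
import numpy as np
import pandas as pd

# Same weekly data as in the question, with a trailing NaN.
all_dates = pd.date_range(start="2018-01-01", freq="W-WED", periods=4)
ts = pd.Series([1.0, 2.0, 3.0, np.nan], index=all_dates)

# Resampler.ffill(): only the newly created business days are filled.
via_resampler = ts.resample("B").ffill()

# asfreq() gives back a plain Series, so Series.ffill() fills every NaN,
# including the one that was already in the original data.
via_series = ts.resample("B").asfreq().ffill()

print(via_resampler.tail(3))
print(via_series.tail(3))
```

The first tail should still end in NaN on 2018-01-24, while the second one carries the 3.0 forward.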

Upvotes: 2

Josh Friedlander

Reputation: 11657

The point of resample and ffill is simply to propagate each weekly observation forward over the days that follow it - if that observation is NaN, then NaN is what gets filled forward. For example:

ts.iloc[1] = np.nan
ts.resample('B').ffill()

2018-01-03    1.0
2018-01-04    1.0
2018-01-05    1.0
2018-01-08    1.0
2018-01-09    1.0
2018-01-10    NaN
2018-01-11    NaN
2018-01-12    NaN
2018-01-15    NaN
2018-01-16    NaN
2018-01-17    3.0
2018-01-18    3.0
2018-01-19    3.0
2018-01-22    3.0
2018-01-23    3.0
2018-01-24    NaN
Freq: B, dtype: float64

In most cases, propagating from the previous week's data would not be the desired behaviour. If you do want to fall back on the previous week's value when the original (weekly) series has a missing value, call ffill() on the weekly series first and then resample.
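
A minimal sketch of that order of operations, assuming the same weekly series as above with the second and last observations missing (weekly_filled and daily are just illustrative names):

```python
import numpy as np
import pandas as pd

# Weekly series with missing values in the middle and at the end.
all_dates = pd.date_range(start="2018-01-01", freq="W-WED", periods=4)
ts = pd.Series([1.0, np.nan, 3.0, np.nan], index=all_dates)

# Step 1: fill the gaps at the weekly level, using the previous week's value.
weekly_filled = ts.ffill()

# Step 2: up-sample to business days; every weekly observation is now valid,
# so nothing but real values gets propagated.
daily = weekly_filled.resample("B").ffill()

print(daily)
```

Doing the fill at the weekly level keeps the decision to reuse last week's value explicit, instead of having it happen as a side effect of the up-sampling.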

Upvotes: 1
