Robert Melikyan

Reputation: 52

How to apply a function/impute on an interval in Pandas

I have a pandas DataFrame with a monthly datetime index and a column of outstanding orders, like below:

Date orders
1991-01-01 nan
1991-02-01 nan
1991-03-01 24
1991-04-01 nan
1991-05-01 nan
1991-06-01 nan
1991-07-01 nan
1991-08-01 34
1991-09-01 nan
1991-10-01 nan
1991-11-01 22
1991-12-01 nan

I want to linearly interpolate the values to fill the NaNs. However, the interpolation has to be applied within non-rolling 6-month blocks. For example, one block would be all the rows from 1991-01-01 to 1991-06-01. Within each block I want forward and backward linear interpolation, where leading or trailing NaNs ramp linearly towards a value of 0 just outside the block. For the dataset above, here is how I would like the end result to look:

Date orders
1991-01-01 8
1991-02-01 16
1991-03-01 24
1991-04-01 18
1991-05-01 12
1991-06-01 6
1991-07-01 17
1991-08-01 34
1991-09-01 30
1991-10-01 26
1991-11-01 22
1991-12-01 11

However, I am lost on how to do this in pandas. Any ideas?

Upvotes: 1

Views: 132

Answers (1)

jezrael

Reputation: 862511

The idea is to group by 6-month blocks, prepend and append a 0 to each group, interpolate, and then remove the first and last 0 from each group:

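import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])

# Pad each block with a 0 on both sides, interpolate linearly, then drop the padding.
# .to_numpy() returns plain values, so the result cannot be misaligned with the group's index.
f = lambda x: pd.Series([0] + x.tolist() + [0]).interpolate().iloc[1:-1].to_numpy()
# Non-overlapping 6-month bins anchored at month start.
df['orders'] = (df.groupby(pd.Grouper(freq='6MS', key='Date'))['orders']
                  .transform(f))

print(df)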
        Date  orders
0  1991-01-01     8.0
1  1991-02-01    16.0
2  1991-03-01    24.0
3  1991-04-01    18.0
4  1991-05-01    12.0
5  1991-06-01     6.0
6  1991-07-01    17.0
7  1991-08-01    34.0
8  1991-09-01    30.0
9  1991-10-01    26.0
10 1991-11-01    22.0
11 1991-12-01    11.0
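
The snippet above assumes `Date` is a regular column. If the dates are in the index, as the question describes, a self-contained variant might look like the sketch below. The helper name `pad_interpolate` and the way the sample data is built are assumptions for illustration only. With a `DatetimeIndex`, `pd.Grouper` should not need the `key` argument.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the dates as the index.
dates = pd.date_range("1991-01-01", periods=12, freq="MS")
orders = [np.nan, np.nan, 24, np.nan, np.nan, np.nan,
          np.nan, 34, np.nan, np.nan, 22, np.nan]
s = pd.Series(orders, index=dates, name="orders")


def pad_interpolate(block):
    """Interpolate one block linearly between an implicit 0 on each side."""
    padded = np.concatenate(([0.0], block.to_numpy(dtype=float), [0.0]))
    filled = pd.Series(padded).interpolate()
    # Drop the two padding zeros and return plain values (no index to align).
    return filled.to_numpy()[1:-1]


# Group on the DatetimeIndex in non-overlapping 6-month bins.
result = s.groupby(pd.Grouper(freq="6MS")).transform(pad_interpolate)
print(result)
```

For the first block the padded values are `0, nan, nan, 24, nan, nan, nan, 0`, so the interpolation rises in steps of 8 up to 24 and then falls in steps of 6, which matches the requested output.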

Upvotes: 1
