areddy

Reputation: 383

Interpolate/extrapolate missing dates in Python?

Let's say I have the following DataFrame:

bb = pd.DataFrame(data = {'date' :['','','','2015-09-02', '2015-09-02', '2015-09-03','','2015-09-08', '', '2015-09-11','2015-09-14','','' ]})     
bb['date'] = pd.to_datetime(bb['date'], format="%Y-%m-%d")     

I want to interpolate and extrapolate linearly to fill the missing date values. I used the following code, but it doesn't change anything. I am new to pandas. Please help.

bb= bb.interpolate(method='time')

Upvotes: 4

Views: 3475

Answers (1)

Serenity

Reputation: 36635

To extrapolate, use bfill() and ffill(). Each missing value is filled with the next valid value (bfill) or the previous valid value (ffill).
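As a minimal sketch of what the two fills do, here is a small made-up Series (the values are illustrative, not the data from the question). Note that this repeats the nearest valid date rather than continuing a linear trend:

```python
import pandas as pd

# Hypothetical sample: missing dates at the start, in the middle, and at the end
s = pd.Series(pd.to_datetime([None, "2015-09-02", None, "2015-09-08", None]))

# bfill copies the next valid value backwards (fills positions 0 and 2);
# ffill then copies the last valid value forwards (fills position 4)
filled = s.bfill().ffill()
print(filled)
```

The order matters for interior gaps: calling bfill() first means position 2 takes the later date, while ffill() first would give it the earlier one.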

To interpolate linearly, use interpolate(), but the dates must first be converted to numbers:

import numpy as np
import pandas as pd
from datetime import datetime

bb = pd.DataFrame(data = {'date' :['','','','2015-09-02', '2015-09-02', '2015-09-03','','2015-09-08', '', '2015-09-11','2015-09-14','','' ]})     
bb['date'] = pd.to_datetime(bb['date'], format="%Y-%m-%d")     

# convert to seconds
tmp = bb['date'].apply(lambda t: (t-datetime(1970,1,1)).total_seconds())
# linear interpolation
tmp.interpolate(inplace=True)    
# back convert to dates
bb['date'] = pd.to_datetime(tmp, unit='s') 
bb['date'] = bb['date'].apply(lambda t: t.date())
# extrapolate the leading missing values by back-filling
# (interpolate() has already carried the last valid date into the trailing gaps)
bb.bfill(inplace=True)

print(bb)

Result:

         date
0  2015-09-02
1  2015-09-02
2  2015-09-02
3  2015-09-02
4  2015-09-02
5  2015-09-03
6  2015-09-05
7  2015-09-08
8  2015-09-09
9  2015-09-11
10 2015-09-14
11 2015-09-14
12 2015-09-14
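The same steps can also be sketched without apply(), using vectorized operations. This is only one possible variant, not the only way to do it. The names epoch, secs and result are my own, errors="coerce" is added here so that the empty strings become NaT, and flooring to whole days with dt.floor("D") is an assumption that stands in for the t.date() call above:

```python
import pandas as pd

raw = ['', '', '', '2015-09-02', '2015-09-02', '2015-09-03', '',
       '2015-09-08', '', '2015-09-11', '2015-09-14', '', '']
bb = pd.DataFrame({'date': raw})
# errors="coerce" turns the empty strings into NaT
bb['date'] = pd.to_datetime(bb['date'], format="%Y-%m-%d", errors="coerce")

# seconds since the Unix epoch; NaT becomes NaN
epoch = pd.Timestamp("1970-01-01")
secs = (bb['date'] - epoch).dt.total_seconds()

# linear interpolation for the gaps, then back-fill the leading NaNs
secs = secs.interpolate().bfill()

# back to datetimes, truncated to whole days
result = pd.to_datetime(secs, unit='s').dt.floor('D')
print(result)
```

Unlike the version above, this keeps the column as datetime64 instead of Python date objects, which is usually more convenient for further pandas work.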

Upvotes: 3
