EJ Kang
EJ Kang

Reputation: 495

Python Interpolate Monthly value to Daily value (linear): Pandas

I have a pandas dataframe with column of year month data(yyyymm). I am planning on interpolate data to daily & weekly values. Here is my df below.

df:
    201301      201302      201303      ...    201709       201710  
a   0.747711    0.793101    0.771819    ...    0.818161     0.812522 
b   0.776537    0.759745    0.733673    ...    0.757496     0.765181 
c   0.801699    0.847655    0.796586    ...    0.784537     0.763551 
d   0.797942    0.687899    0.729911    ...    0.819887     0.772395 
e   0.777472    0.799676    0.782947    ...    0.804533     0.791759 
f   0.780933    0.750774    0.781056    ...    0.790846     0.773705 
g   2.071699    2.261739    2.126915    ...    1.891780     2.098914 

As you can see that my df is in montly column data and I am hoping to change this to daily values. I am planning on using linear functions. Here is example.

# (201302 - 201301)/31 (since January 2013 has 31 days)
a = (0.793101-0.747711)/31
# now a is the daily increasing (or decresing depends on values) value for a day.
# 2013-01-01 value woud be
0.747711
# 2013-01-02 value woud be
0.747711 + a
# 2013-01-03 value woud be
0.747711 + (a*2)
# last day of January would be
0.747711 + (a*30)
# first day of Feb would be
0.747711 + (a*31) which is 0.793101 (201302 value)

So my df_daily would have every day from 2013 to 2017 Oct(first day) and the values would be just like above. I am very week on working with timestamps so it would be great if there is any way to interpolate my value from month to day values. Thanks!

Oh please let me know if my question is confusing...

Upvotes: 2

Views: 4248

Answers (1)

jezrael
jezrael

Reputation: 863166

First convert columns to datetimes by to_datetime, then reindex for NaNs for missing days and last interpolate:

df.columns = pd.to_datetime(df.columns, format='%Y%m')
#by first and last values of columns
rng = pd.date_range(df.columns[0], df.columns[-1])
#alternatively min and max of columns
#rng = pd.date_range(df.columns.min(), df.columns.max())

df = df.reindex(rng, axis=1).interpolate(axis=1)

Verify solution:

a = (0.793101-0.747711)/31
print (0.747711 + a)
print (0.747711 + a*2)
print (0.747711 + a*3)
0.7491751935483871
0.7506393870967742
0.7521035806451613

print (df)
  2013-01-01  2013-01-02  2013-01-03  2013-01-04  2013-01-05  2013-01-06  \
a    0.747711    0.749175    0.750639    0.752104    0.753568    0.755032   
b    0.776537    0.775995    0.775454    0.774912    0.774370    0.773829   
c    0.801699    0.803181    0.804664    0.806146    0.807629    0.809111   
d    0.797942    0.794392    0.790842    0.787293    0.783743    0.780193   
e    0.777472    0.778188    0.778905    0.779621    0.780337    0.781053   
f    0.780933    0.779960    0.778987    0.778014    0.777042    0.776069   
g    2.071699    2.077829    2.083960    2.090090    2.096220    2.102351   

   2013-01-07  2013-01-08  2013-01-09  2013-01-10     ...      2017-09-22  \
a    0.756496    0.757960    0.759425    0.760889     ...        0.814214   
b    0.773287    0.772745    0.772204    0.771662     ...        0.762876   
c    0.810594    0.812076    0.813559    0.815041     ...        0.769847   
d    0.776643    0.773094    0.769544    0.765994     ...        0.786643   
e    0.781770    0.782486    0.783202    0.783918     ...        0.795591   
f    0.775096    0.774123    0.773150    0.772177     ...        0.778847   
g    2.108481    2.114611    2.120742    2.126872     ...        2.036774   

   2017-09-23  2017-09-24  2017-09-25  2017-09-26  2017-09-27  2017-09-28  \
a    0.814026    0.813838    0.813650    0.813462    0.813274    0.813086   
b    0.763132    0.763388    0.763644    0.763900    0.764156    0.764413   
c    0.769147    0.768448    0.767748    0.767049    0.766349    0.765650   
d    0.785060    0.783476    0.781893    0.780310    0.778727    0.777144   
e    0.795165    0.794740    0.794314    0.793888    0.793462    0.793036   
f    0.778276    0.777705    0.777133    0.776562    0.775990    0.775419   
g    2.043678    2.050583    2.057487    2.064392    2.071296    2.078201   

   2017-09-29  2017-09-30  2017-10-01  
a    0.812898    0.812710    0.812522  
b    0.764669    0.764925    0.765181  
c    0.764950    0.764251    0.763551  
d    0.775561    0.773978    0.772395  
e    0.792611    0.792185    0.791759  
f    0.774848    0.774276    0.773705  
g    2.085105    2.092010    2.098914  

[7 rows x 1735 columns]

Upvotes: 4

Related Questions