Tartaglia

Reputation: 1041

Conversion of Daily pandas dataframe to minute frequency does not work for 2 row dataframe

I am trying to convert a daily-frequency dataframe to minute data. A previous post (Conversion of Daily pandas dataframe to minute frequency) suggested the ffill method shown below, but it does not seem to work for dataframes that consist of only 2 rows.

This is the dataframe I want to convert:

import pandas as pd

# Sample daily data: six tickers on two dates.
# (Named `rows` rather than `dict` to avoid shadowing the builtin.)
rows = [
    {'ticker': 'jpm', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'ge', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'fb', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'aapl', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'msft', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'amzn', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'jpm', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'ge', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'fb', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'aapl', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'msft', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'amzn', 'date': '2016-11-29', 'returns': 0.2},
]
df = pd.DataFrame(rows)
df['date'] = pd.to_datetime(df['date'])
df = df.set_index(['date', 'ticker'], drop=True)

This works on the entire dataframe:

df_min = df.unstack().asfreq('Min', method='ffill').between_time('8:30','16:00').stack()

But when I apply the same code to a smaller dataframe, it returns an empty dataframe:

df2=df.iloc[0:2,:]

df2_min = df2.unstack().asfreq('Min', method='ffill').between_time('8:30','16:00').stack()

Does anyone have an explanation for this odd behaviour?

Edit: I noticed the code only works if the dataframe has at least 7 rows.

Upvotes: 1

Views: 82

Answers (1)

jezrael

Reputation: 862681

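If the input DataFrame has only 2 rows, unstack reshapes it into a one-row DataFrame, and pandas cannot create a continuous minute DataFrame from it because the DatetimeIndex holds only one value. asfreq fills only the range between the first and last timestamp, so the single midnight row stays as it is and then falls outside the between_time window. The same reasoning explains the 7-row threshold: the first 6 rows all share one date, and the 7th row is the first one with a second date.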
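
To make this concrete, here is a minimal sketch that starts from a one-row frame like the one unstack produces. The variable names are only illustrative, and it uses the lowercase 'min' alias on the assumption that it is equivalent to the 'Min' used in the question:

```python
import pandas as pd

# One-row frame with a single midnight timestamp, similar to what
# unstacking two rows that share the same date produces.
one_day = pd.DataFrame(
    {"jpm": [0.2], "ge": [0.2]},
    index=pd.DatetimeIndex(["2016-11-28"], name="date"),
)

# asfreq only generates timestamps between the first and last index value,
# so a single timestamp remains a single row at 00:00.
per_minute = one_day.asfreq("min", method="ffill")
print(len(per_minute))

# 00:00 lies outside the 8:30-16:00 window, so the filter removes that row.
in_session = per_minute.between_time("8:30", "16:00")
print(in_session.empty)
```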

A possible solution is to add the next day as a helper row after reshaping, fill it with the data of the last row, apply the original solution, and as a final step remove the helper row by position with iloc:

df2=df.iloc[0:2]
print (df2)
                   returns
date       ticker         
2016-11-28 jpm         0.2
           ge          0.2

df3 = df2.unstack()
print (df3)
ticker         jpm   ge
date                   
2016-11-28     0.2  0.2

df3.loc[df3.index.max() + pd.Timedelta(1, unit='d')] = df3.iloc[-1]
print (df3)
           returns     
ticker         jpm   ge
date                   
2016-11-28     0.2  0.2
2016-11-29     0.2  0.2 <- helper row

df_min = df3.asfreq('Min', method='ffill')
print (df_min.tail())
                    returns     
ticker                  jpm   ge
date                            
2016-11-28 23:56:00     0.2  0.2
2016-11-28 23:57:00     0.2  0.2
2016-11-28 23:58:00     0.2  0.2
2016-11-28 23:59:00     0.2  0.2
2016-11-29 00:00:00     0.2  0.2 <- helper row

df_min = df_min.iloc[:-1].between_time('8:30','16:00').stack()
#print (df_min)
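
If this is needed in more than one place, the steps above could be wrapped in a small function. This is only a sketch: the name daily_to_minutes and its parameters are made up for illustration, and it assumes the input is indexed by (date, ticker) as in the question. It also uses the lowercase 'min' alias in place of 'Min'.

```python
import pandas as pd


def daily_to_minutes(daily, start="8:30", end="16:00"):
    """Expand a (date, ticker)-indexed daily frame to minute rows.

    Hypothetical helper that packages the steps from this answer.
    """
    wide = daily.unstack()
    # Helper row one day after the last date, so asfreq has a range to fill.
    wide.loc[wide.index.max() + pd.Timedelta(1, unit="d")] = wide.iloc[-1]
    minutes = wide.asfreq("min", method="ffill")
    # Drop the helper row (last position), then keep only the session window.
    return minutes.iloc[:-1].between_time(start, end).stack()


# Usage with a 2-row frame like df2 from the question.
df2 = pd.DataFrame(
    {
        "date": pd.to_datetime(["2016-11-28", "2016-11-28"]),
        "ticker": ["jpm", "ge"],
        "returns": [0.2, 0.2],
    }
).set_index(["date", "ticker"])

result = daily_to_minutes(df2)
print(result.head())
```

Because the helper row is always appended, the same function should also work unchanged for inputs that already span several days.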

Upvotes: 1
