Reputation: 161
I have a table of IDs and dates of quarterly data and i would like to reindex this to daily (weekdays).
Example table:
I'm trying to figure out a pythonic or pandas way to reindex to a higher frequency date range e.g. daily and forward fill any NaNs.
so far have tried:
df = pd.read_sql('select date, id, type, value from db_table' con=conn, index_col=['date', 'id', 'type'])
dates = pd.bdate_range(start, end)
new_idx = pd.MultiIndex.from_product([dates, df.index.get_level_values(1), df.index.get_level_values(2)]
new_df = df.reindex(new_idx)
#this just hangs
new_df = new_df.groupby(level=1).fillna(method='ffill')
to no avail. I either get a
Exception: cannot handle a non-unique multi-index!
Or, if the dates are consistent between ids and types the individual dates are reproduced multiple times (which sounds like a bug?)
Ultimately I would just like to group the table by date, id and type and have a consistent date index across ids and types.
Is there a way to do this in pandas?
Upvotes: 1
Views: 332
Reputation: 161
Wen-Ben's answer is almost there - thank you for that. The only thing missing is grouping by ['id', 'type'] when doing the forward fill.
Further, when creating the new multindex in my use case should have unique values:
new_idx = pd.MultiIndex.from_product([dates, df.index.get_level_values(1).unique(), df.index.get_level_values(2).unique()])
Upvotes: 0
Reputation: 323326
Yes you can do with merge
new_idx_frame=new_idx.to_frame()
new_idx_frame.columns=['date', 'id', 'type']
Yourdf=df.reset_index().merge(new_idx_frame,how='right',sort =True).groupby('id').ffill()# here I am using toy data
Out[408]:
id date type value
0 1 1 1 NaN
1 1 1 2 NaN
2 2 1 1 666666.0
3 2 1 2 99999.0
4 1 2 1 -1.0
5 1 2 1 -1.0
6 1 2 2 -1.0
7 2 2 1 99999.0
8 2 2 2 99999.0
Sample data
df=pd.DataFrame({'date':[1,1,2,2],'id':[2,2,1,1],'type':[2,1,1,1],'value':[99999,666666,-1,-1]})
df=df.set_index(['date', 'id', 'type'])
new_idx = pd.MultiIndex.from_product([[1,2], [1,2],[1,2]])
Upvotes: 2