James Edwards
James Edwards

Reputation: 161

pandas reindex multindex to higher frequency dates

I have a table of IDs and dates of quarterly data and i would like to reindex this to daily (weekdays).

Example table:

enter image description here

I'm trying to figure out a pythonic or pandas way to reindex to a higher frequency date range e.g. daily and forward fill any NaNs.

so far have tried:

df = pd.read_sql('select date, id, type, value from db_table' con=conn, index_col=['date', 'id', 'type'])
dates = pd.bdate_range(start, end)
new_idx = pd.MultiIndex.from_product([dates, df.index.get_level_values(1), df.index.get_level_values(2)]
new_df = df.reindex(new_idx)

#this just hangs
new_df = new_df.groupby(level=1).fillna(method='ffill')

to no avail. I either get a

Exception: cannot handle a non-unique multi-index!

Or, if the dates are consistent between ids and types the individual dates are reproduced multiple times (which sounds like a bug?)

Ultimately I would just like to group the table by date, id and type and have a consistent date index across ids and types.

Is there a way to do this in pandas?

Upvotes: 1

Views: 332

Answers (2)

James Edwards
James Edwards

Reputation: 161

Wen-Ben's answer is almost there - thank you for that. The only thing missing is grouping by ['id', 'type'] when doing the forward fill.

Further, when creating the new multindex in my use case should have unique values:

new_idx = pd.MultiIndex.from_product([dates, df.index.get_level_values(1).unique(), df.index.get_level_values(2).unique()])

Upvotes: 0

BENY
BENY

Reputation: 323326

Yes you can do with merge

new_idx_frame=new_idx.to_frame()
new_idx_frame.columns=['date', 'id', 'type']
Yourdf=df.reset_index().merge(new_idx_frame,how='right',sort =True).groupby('id').ffill()# here I am using toy data 
Out[408]: 
   id  date  type     value
0   1     1     1       NaN
1   1     1     2       NaN
2   2     1     1  666666.0
3   2     1     2   99999.0
4   1     2     1      -1.0
5   1     2     1      -1.0
6   1     2     2      -1.0
7   2     2     1   99999.0
8   2     2     2   99999.0

Sample data

df=pd.DataFrame({'date':[1,1,2,2],'id':[2,2,1,1],'type':[2,1,1,1],'value':[99999,666666,-1,-1]})
df=df.set_index(['date', 'id', 'type'])
new_idx = pd.MultiIndex.from_product([[1,2], [1,2],[1,2]])

Upvotes: 2

Related Questions