Anthony

Reputation: 135

Pandas - broadcasting daily data across intraday data

I have a frame of daily data and a frame of intraday data, and I'd like to add the two frames by broadcasting the daily data across each day. Minimal example below:

import numpy as np
import pandas as pd

cols = ['A', 'B']
days = pd.date_range('1/1/2000', periods=2, freq='D')
df_d = pd.DataFrame(np.arange(4).reshape((2, 2)), index=days, columns=cols)
hours = pd.date_range('1/1/2000', periods=4, freq='12h')
df_h = pd.DataFrame(np.arange(8).reshape((4, 2)), index=hours, columns=cols)

target = pd.DataFrame([[0, 2], [2, 4], [6, 8], [8, 10]], index=hours, columns=cols)

>>> df_d
            A  B
2000-01-01  0  1
2000-01-02  2  3

>>> df_h
                     A  B
2000-01-01 00:00:00  0  1
2000-01-01 12:00:00  2  3
2000-01-02 00:00:00  4  5
2000-01-02 12:00:00  6  7

>>> target
                     A   B
2000-01-01 00:00:00  0   2
2000-01-01 12:00:00  2   4
2000-01-02 00:00:00  6   8
2000-01-02 12:00:00  8  10

In effect I want target = df_h "+" df_d, computed in a robust way, since the intraday timestamps could change and there could be NaNs in the data. I tried reindexing df_d to hours and then forward-filling, but that doesn't inherently respect daily boundaries and is fragile to missing data.
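
To make the fragility concrete, here is a minimal sketch of the reindex-and-forward-fill attempt. The gap on 2000-01-02 and the longer intraday range are assumptions added for illustration; they are not part of the example above.

```python
import numpy as np
import pandas as pd

cols = ['A', 'B']
# Assumption for illustration: the daily frame has no row for 2000-01-02.
days = pd.to_datetime(['2000-01-01', '2000-01-03'])
df_d = pd.DataFrame(np.arange(4).reshape((2, 2)), index=days, columns=cols)
hours = pd.date_range('2000-01-01', periods=6, freq=pd.Timedelta(hours=12))

# Attempted approach: reindex to the intraday stamps, then forward fill.
# The values from 2000-01-01 leak into 2000-01-02, which has no daily data.
ffilled = df_d.reindex(hours).ffill()

# Looking each timestamp up by its calendar day leaves the gap as NaN instead.
by_day = df_d.reindex(hours.floor('D'))
by_day.index = hours
```

In ffilled the two rows for 2000-01-02 silently carry the values from 2000-01-01, whereas by_day keeps them as NaN, which is the behaviour I'm after.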

Upvotes: 2

Views: 234

Answers (1)

Shubham Sharma

Reputation: 71689

You can use .add after flooring the index of df_h to daily resolution so that it aligns with df_d, then restore the original intraday index:

i = df_h.index
df_h.set_index(i.floor('D')).add(df_d, fill_value=0).set_index(i)

                     A   B
2000-01-01 00:00:00  0   2
2000-01-01 12:00:00  2   4
2000-01-02 00:00:00  6   8
2000-01-02 12:00:00  8  10
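
One caveat, as far as I can tell: .add aligns on the union of both indexes, so if df_d contains a day that df_h does not cover, the result gains an extra row and the final set_index(i) should fail with a length mismatch. A possible variant, sketched under the assumption that df_d has a unique index, is to look up the daily row for each intraday timestamp first and keep the index of df_h throughout. The third day in df_d below is added only to illustrate that case.

```python
import numpy as np
import pandas as pd

cols = ['A', 'B']
# Assumption for illustration: df_d covers one more day than df_h.
days = pd.date_range('2000-01-01', periods=3, freq='D')
df_d = pd.DataFrame(np.arange(6).reshape((3, 2)), index=days, columns=cols)
hours = pd.date_range('2000-01-01', periods=4, freq=pd.Timedelta(hours=12))
df_h = pd.DataFrame(np.arange(8).reshape((4, 2)), index=hours, columns=cols)

# Pick the daily row matching each intraday timestamp's calendar day.
# Days absent from df_d become NaN rows; extra days in df_d are ignored.
daily = df_d.reindex(df_h.index.floor('D'))
daily.index = df_h.index  # relabel with the intraday timestamps

# fill_value=0 treats a value missing in one frame as 0;
# a cell missing in both frames stays NaN.
result = df_h.add(daily, fill_value=0)
```

Drop fill_value=0 if a NaN in either frame should propagate to the result instead of being treated as zero.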

Upvotes: 2
