Reputation: 135
I have a frame of daily data and a frame of intraday data, and I'd like to add the two frames by broadcasting the daily data across each day. Minimal example below:
import numpy as np
import pandas as pd
cols = ['A', 'B']
days = pd.date_range('1/1/2000', periods=2, freq='D')
df_d = pd.DataFrame(np.arange(4).reshape((2, 2)), index=days, columns=cols)
hours = pd.date_range('1/1/2000', periods=4, freq='12H')
df_h = pd.DataFrame(np.arange(8).reshape((4, 2)), index=hours, columns=cols)
target = pd.DataFrame([[0, 2],[2, 4],[6, 8],[8, 10]], index=hours, columns=cols)
>>> df_d
            A  B
2000-01-01  0  1
2000-01-02  2  3
>>> df_h
                     A  B
2000-01-01 00:00:00  0  1
2000-01-01 12:00:00  2  3
2000-01-02 00:00:00  4  5
2000-01-02 12:00:00  6  7
>>> target
                     A   B
2000-01-01 00:00:00  0   2
2000-01-01 12:00:00  2   4
2000-01-02 00:00:00  6   8
2000-01-02 12:00:00  8  10
So I would want to do target = df_h "+" df_d in a robust way, as the intraday timestamps could change and there could be NaNs in the data. I tried reindexing df_d to hours and then forward filling, but this doesn't inherently respect daily boundaries and is fragile to missing data.
Upvotes: 2
Views: 234
Reputation: 71689
You can use .add after normalizing the index of df_h:
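# Keep the original intraday index so it can be restored afterwards
i = df_h.index
# Floor each timestamp to its day so rows align with df_d, add, then put the intraday index back
df_h.set_index(i.floor('D')).add(df_d, fill_value=0).set_index(i)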
                     A   B
2000-01-01 00:00:00  0   2
2000-01-01 12:00:00  2   4
2000-01-02 00:00:00  6   8
2000-01-02 12:00:00  8  10
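If df_d might contain days that never appear in df_h, or might be missing some days, a variant that looks up the daily row for each intraday timestamp may be easier to reason about. The following is only a sketch under those assumptions: add_daily is a hypothetical helper name, not a pandas function, and it relies on df_d having a unique index of midnight timestamps.

```python
import numpy as np
import pandas as pd


def add_daily(df_intraday, df_daily):
    """Hypothetical helper: add each day's row of df_daily to every
    intraday row of df_intraday that falls on that calendar day."""
    # Map every intraday timestamp to midnight of its own day.
    day_keys = df_intraday.index.normalize()
    # Look up the daily row for each timestamp. Days absent from
    # df_daily come back as all-NaN rows; extra days are ignored.
    daily_aligned = df_daily.reindex(day_keys)
    # Relabel with the intraday index so the frames align row for row.
    daily_aligned.index = df_intraday.index
    # NaN in either operand propagates to the result here.
    return df_intraday.add(daily_aligned)


cols = ['A', 'B']
days = pd.date_range('2000-01-01', periods=2, freq='D')
df_d = pd.DataFrame(np.arange(4).reshape((2, 2)), index=days, columns=cols)
# A Timedelta is used as the frequency to sidestep differences in
# frequency alias spelling across pandas versions.
hours = pd.date_range('2000-01-01', periods=4, freq=pd.Timedelta(hours=12))
df_h = pd.DataFrame(np.arange(8).reshape((4, 2)), index=hours, columns=cols)

result = add_daily(df_h, df_d)
print(result)
```

Because the lookup is keyed on the calendar day, no forward filling is involved and nothing leaks across a day boundary. In this sketch NaNs propagate; if a NaN on one side should instead be treated as 0 when the other side has a value, fill_value=0 can be passed to .add as in the one-liner above.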
Upvotes: 2