Reputation: 5357
I have found this nice function pandas.merge_asof
.
From the documentation
pandas.merge_asof(left, right, on=None, left_on=None, right_on=None)
Parameters:
left : DataFrame
right : DataFrame
on : label
Field name to join on. Must be found in both DataFrames.
The data MUST be ordered.
Furthermore this must be a numeric column,such as datetimelike, integer, or float.
On or left_on/right_on must be given.
and it works as expected.
However, my merged dataframe keeps as columns on
only the one that originally was in left
. I would need to keep them both, so to have
mydf=pandas.merge_asof(left, right, on='Time')
and mydf
to contain both Time
from left
and right
Example data:
a=pd.DataFrame(data=pd.date_range('20100201', periods=100, freq='6h3min'),columns=['Time'])
b=pd.DataFrame(data=
pd.date_range('20100201', periods=24, freq='1h'),columns=['Time'])
b['val']=range(b.shape[0])
out=pd.merge_asof(a,b,on='Time',direction='forward',tolerance=pd.Timedelta('30min'))
Upvotes: 22
Views: 8718
Reputation: 21
One trick that I found while working on time-series data is this function. To articulate what I have done is perform an outer join on both dataframes. Suppose the entry against some "Date and Time" is not in the other dataframe. In that case, we copied the value it had previously, as I assumed that the value in the other dataframe stayed whatever it was before that time.
def merger(df1,df2):
df1['Date and Time'] = pd.to_datetime(df1['Date and Time'], errors='coerce')
df2['Date and Time'] = pd.to_datetime(df2['Date and Time'], errors='coerce')
df = pd.merge(df1,df2,on = 'Date and Time', how='outer')
df.fillna(method = 'ffill',inplace = True)
return df
Hope it Helps.
Upvotes: 2
Reputation: 863341
I think one possible solution is rename columns:
out = pd.merge_asof(a.rename(columns={'Time':'Time1'}),
b.rename(columns={'Time':'Time2'}),
left_on='Time1',
right_on='Time2',
direction='forward',
tolerance=pd.Timedelta('30min'))
print (out.head())
Time1 Time2 val
0 2010-02-01 00:00:00 2010-02-01 0.0
1 2010-02-01 06:03:00 NaT NaN
2 2010-02-01 12:06:00 NaT NaN
3 2010-02-01 18:09:00 NaT NaN
4 2010-02-02 00:12:00 NaT NaN
Upvotes: 24