Rocky
Rocky

Reputation: 105

Pandas: Nearest merging

I have two dataframes. I would like to merge the second dataframe to the first one on nearest time point.

Example: (df1)

user    t   
Ajay    2020-05-05 18:00:00
Ajay    2020-05-05 20:00:00
Ajay    2020-05-06 17:15:00
Ajay    2020-05-06 20:30:00
Ajay    2020-05-06 21:15:00

Example: (df2)

user    t   flag
Ajay    2020-05-05 18:04:00 10
Ajay    2020-05-05 18:12:00 7
Ajay    2020-05-05 20:02:00 9
Ajay    2020-05-06 17:13:00 2
Ajay    2020-05-06 17:45:00 5
Ajay    2020-05-06 20:30:00 8
Ajay    2020-05-06 21:12:00 9

df2 should be merged on df1 on nearest time stamp. Based on my limited understanding, pandas merge only on exact value. is it possible?

expected output:

user    t    flag
Ajay    2020-05-05 18:00:00   10
Ajay    2020-05-05 20:00:00   9
Ajay    2020-05-06 17:15:00   2
Ajay    2020-05-06 20:30:00   8  
Ajay    2020-05-06 21:15:00   9

Upvotes: 2

Views: 73

Answers (1)

jezrael
jezrael

Reputation: 863321

Use merge_asof with direction='nearest' parameter:

df1['t'] = pd.to_datetime(df1['t'])
df2['t'] = pd.to_datetime(df2['t'])

df = pd.merge_asof(df1, df2, on='t', direction='nearest')
print (df)
  user_x                   t user_y  flag
0   Ajay 2020-05-05 18:00:00   Ajay    10
1   Ajay 2020-05-05 20:00:00   Ajay     9
2   Ajay 2020-05-06 17:15:00   Ajay     2
3   Ajay 2020-05-06 20:30:00   Ajay     8
4   Ajay 2020-05-06 21:15:00   Ajay     9

If need also merge by User add parameter by:

df1['t'] = pd.to_datetime(df1['t'])
df2['t'] = pd.to_datetime(df2['t'])

df = pd.merge_asof(df1, df2, on='t', by='user', direction='nearest')
print (df)
   user                   t  flag
0  Ajay 2020-05-05 18:00:00    10
1  Ajay 2020-05-05 20:00:00     9
2  Ajay 2020-05-06 17:15:00     2
3  Ajay 2020-05-06 20:30:00     8
4  Ajay 2020-05-06 21:15:00     9

Upvotes: 2

Related Questions