Reputation: 324
I have 2 dataframes Dataframe1:
id date1
1 11-04-2022
1 03-02-2011
2 03-05-2222
3 01-01-2001
4 02-02-2012
and Dataframe2:
id date2 data data2
1 11-02-2222 1 3
1 11-02-1999 3 4
1 11-03-2022 4 5
2 22-03-4444 5 6
2 22-02-2020 7 8
...
What I would like to do is take the row from dataframe2 with the closest date to date1 in Dataframe1 but it has to fit the id, but the date has to be before the one of date1 The desired output would look like this:
id date1 date2 data data2
1 11-04-2022 11-03-2022 4 5
1 03-02-2011 11-02-1999 3 4
2 03-05-2222 22-02-2020 7 8
How would I do this using pandas?
Upvotes: 1
Views: 641
Reputation: 195408
Try pd.merge_asof
, but first convert date1, date2 to datetime and sort both timeframes:
df1["date1"] = pd.to_datetime(df1["date1"])
df2["date2"] = pd.to_datetime(df2["date2"])
df1 = df1.sort_values(by="date1")
df2 = df2.sort_values(by="date2")
print(
pd.merge_asof(
df1,
df2,
by="id",
left_on="date1",
right_on="date2",
direction="nearest",
).dropna(subset=["date2"])
)
Upvotes: 5