jr123456jr987654321
jr123456jr987654321

Reputation: 324

Get row with closest date in other dataframe pandas

I have 2 dataframes Dataframe1:

id date1
1 11-04-2022
1 03-02-2011
2 03-05-2222
3 01-01-2001
4 02-02-2012

and Dataframe2:

id date2 data data2
1 11-02-2222 1 3
1 11-02-1999 3 4
1 11-03-2022 4 5
2 22-03-4444 5 6
2 22-02-2020 7 8
...

What I would like to do is take the row from dataframe2 with the closest date to date1 in Dataframe1 but it has to fit the id, but the date has to be before the one of date1 The desired output would look like this:

id date1 date2 data data2
1 11-04-2022 11-03-2022 4 5
1 03-02-2011 11-02-1999 3 4
2 03-05-2222 22-02-2020 7 8

How would I do this using pandas?

Upvotes: 1

Views: 641

Answers (1)

Andrej Kesely
Andrej Kesely

Reputation: 195408

Try pd.merge_asof, but first convert date1, date2 to datetime and sort both timeframes:

df1["date1"] = pd.to_datetime(df1["date1"])
df2["date2"] = pd.to_datetime(df2["date2"])

df1 = df1.sort_values(by="date1")
df2 = df2.sort_values(by="date2")

print(
    pd.merge_asof(
        df1,
        df2,
        by="id",
        left_on="date1",
        right_on="date2",
        direction="nearest",
    ).dropna(subset=["date2"])
)

Upvotes: 5

Related Questions