Ach113
Ach113

Reputation: 1825

Pandas join two dataframes with condition

I want to join two dataframes together, both dataframes have date columns (df1[date1], df2[date2]). I want the joined dataframe to satisfy this condition df2[date2] > df1[date1]. Second dataframe does not have any duplicates but first one does, so this does not work as expected:

I know for certain that for every date in df2 there is a date in df1 which satisfies this condition. But I cannot figure out how to join them properly. I have tried doing this:

joined = df1.join(df2, how='inner')
joined = joined.query('date2 > date1')

But since df1 has entries with duplicate id-s the way they align after join results in bunch of rows not satisfying the condition, so I get left with smaller database. How can I accomplish this?

Upvotes: 1

Views: 1089

Answers (1)

Poe Dator
Poe Dator

Reputation: 4893

based on your clairification I sugegst the following solution:

1) concatenate (not join) the 2 dataframes.

df12 =  pd.concat([df1, df2], axis=1)

I assume that the indices match. If not - reindex on id or join on id.

2) filter the rows that match criteria

df12 = df12[df12['date2'] > df12['date1]]

Upvotes: 1

Related Questions