Reputation: 1825
I want to join two dataframes together, both dataframes have date columns (df1[date1]
, df2[date2]
). I want the joined dataframe to satisfy this condition df2[date2] > df1[date1]
. Second dataframe does not have any duplicates but first one does, so this does not work as expected:
I know for certain that for every date in df2 there is a date in df1 which satisfies this condition. But I cannot figure out how to join them properly. I have tried doing this:
joined = df1.join(df2, how='inner')
joined = joined.query('date2 > date1')
But since df1 has entries with duplicate id-s the way they align after join results in bunch of rows not satisfying the condition, so I get left with smaller database. How can I accomplish this?
Upvotes: 1
Views: 1089
Reputation: 4893
based on your clairification I sugegst the following solution:
1) concatenate
(not join
) the 2 dataframes.
df12 = pd.concat([df1, df2], axis=1)
I assume that the indices match. If not - reindex on id or join
on id.
2) filter the rows that match criteria
df12 = df12[df12['date2'] > df12['date1]]
Upvotes: 1