NinjaGaiden
NinjaGaiden

Reputation: 3146

pandas compare two dataframes with criteria

I have two dataframes. df1 and df2. I would like to get whatever values are common from df1 and df2 and the dt value of df2 must be greater than df1's dt value In this case, the expected value is fee

df1 = pd.DataFrame([['2015-01-01 06:00','foo'], 
    ['2015-01-01 07:00','fee'], ['2015-01-01 08:00','fum']],
    columns=['dt', 'value'])
df1.dt=pd.to_datetime(df1.dt)

df2=pd.DataFrame([['2015-01-01 06:10','zoo'],
    ['2015-01-01 07:10','fee'],['2015-01-01 08:10','feu'],
    ['2015-01-01 09:10','boo']], columns=['dt', 'value'])
df2.dt=pd.to_datetime(df2.dt)

Upvotes: 0

Views: 535

Answers (1)

EdChum
EdChum

Reputation: 394459

One way would be to merge on 'value' column so this will produce only matching rows, you can then filter the merged df using the 'dt_x', 'dt_y' columns:

In [15]:

merged = df2.merge(df1, on='value')
merged[merged['dt_x'] > merged['dt_y']]
Out[15]:
                 dt_x value                dt_y
0 2015-01-01 07:10:00   fee 2015-01-01 07:00:00

You can't do something like the following because the lengths don't match:

df2[ (df2['value'].isin(df1['value'])) & (df2['dt'] > df1['dt']) ]

raises:

ValueError: Series lengths must match to compare

Upvotes: 2

Related Questions