Reputation: 110
I have a DataFrame with a sorted, non-unique datetime index, and I need to find the row that follows a specific match on some of the data columns.
I can find the matching row with DataFrame.query(), which returns a new DataFrame, but I don't know how to locate that row's position in the original DataFrame. Here is an example:
import pandas as pd
import numpy as np
from datetime import datetime
ts_index = [
    datetime.strptime('2016-06-19 22:50:22.189', '%Y-%m-%d %H:%M:%S.%f'),
    datetime.strptime('2016-06-19 22:50:22.189', '%Y-%m-%d %H:%M:%S.%f'),
    datetime.strptime('2016-06-19 22:50:22.610', '%Y-%m-%d %H:%M:%S.%f')
]
bid_price = [ 77.693, 77.692, 77.692 ]
bid_qty = [ 50.0, 100.0, 50.0 ]
ask_price = [ 77.709, 77.709, 77.709 ]
ask_qty = [ 50.0, 50.0, 50.0 ]
df = pd.DataFrame(index=ts_index, data={'BID_PRICE': bid_price,
'BID_QTY': bid_qty, 'ASK_PRICE': ask_price, 'ASK_QTY': ask_qty})
most_recent_match = df.query('(BID_PRICE == 77.692) and (BID_QTY == 100.0)').tail(1)
print(most_recent_match)
Is it possible to search / locate a position in a DataFrame using an entire row (index and columns)?
Upvotes: 1
Views: 5300
Reputation: 7997
Does this work? Reset the index, then use the matched row's index label, which after the reset is also its integer position in the original DataFrame:
df = pd.DataFrame(index=ts_index, data={'BID_PRICE': bid_price,
'BID_QTY': bid_qty, 'ASK_PRICE': ask_price, 'ASK_QTY': ask_qty})
df.reset_index(inplace = True)
most_recent_match = df.query('(BID_PRICE == 77.692) and (BID_QTY == 100.0)').tail(1)
# after reset_index the label equals the integer position, so .loc returns the matched row
df.loc[most_recent_match.index[0]]
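Since the question asks for the row after the match, here is a minimal sketch of that extra step. It assumes the default 0..n-1 index that reset_index() produces; the names flat, pos and next_row are only illustrative, and the sample data is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

ts_index = pd.to_datetime([
    '2016-06-19 22:50:22.189',
    '2016-06-19 22:50:22.189',
    '2016-06-19 22:50:22.610',
])
df = pd.DataFrame(index=ts_index, data={
    'BID_PRICE': [77.693, 77.692, 77.692],
    'BID_QTY': [50.0, 100.0, 50.0],
    'ASK_PRICE': [77.709, 77.709, 77.709],
    'ASK_QTY': [50.0, 50.0, 50.0],
})

# Work on a copy with a 0..n-1 index so a label doubles as a position;
# the original df keeps its datetime index.
flat = df.reset_index()
match = flat.query('(BID_PRICE == 77.692) and (BID_QTY == 100.0)').tail(1)
pos = match.index[0]

# Guard against the match being the last row, which has no successor.
next_row = df.iloc[[pos + 1]] if pos + 1 < len(df) else df.iloc[0:0]
print(next_row)
```

Using a copy rather than inplace=True means the lookup goes back to the original frame through iloc, so the duplicate timestamps never have to be disambiguated by label.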
Upvotes: 2
Reputation: 879341
You could create a boolean mask, then shift it down by one row:
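# fill_value=False keeps the mask boolean; a plain shift(1) leaves NaN in the first slot
mask = ((df['BID_PRICE'] == 77.692) & (df['BID_QTY'] == 100.0)).shift(1, fill_value=False)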
df.loc[mask]
yields
In [17]: df.loc[mask]
Out[17]:
                         ASK_PRICE  ASK_QTY  BID_PRICE  BID_QTY
2016-06-19 22:50:22.610     77.709     50.0     77.692     50.0
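If you want the integer positions themselves rather than the selected rows, one possible sketch is to pass the unshifted mask through np.flatnonzero. This is a variation on the mask idea, not part of the approach above; positions, next_pos and following are illustrative names, and the sample data is rebuilt inline.

```python
import numpy as np
import pandas as pd

ts_index = pd.to_datetime([
    '2016-06-19 22:50:22.189',
    '2016-06-19 22:50:22.189',
    '2016-06-19 22:50:22.610',
])
df = pd.DataFrame(index=ts_index, data={
    'BID_PRICE': [77.693, 77.692, 77.692],
    'BID_QTY': [50.0, 100.0, 50.0],
    'ASK_PRICE': [77.709, 77.709, 77.709],
    'ASK_QTY': [50.0, 50.0, 50.0],
})

# Unshifted boolean mask of the matching rows.
mask = (df['BID_PRICE'] == 77.692) & (df['BID_QTY'] == 100.0)

# Integer positions of every match, independent of the duplicate index labels.
positions = np.flatnonzero(mask.to_numpy())

# Step one row forward and drop any position that runs past the end.
next_pos = positions + 1
next_pos = next_pos[next_pos < len(df)]

following = df.iloc[next_pos]
print(positions, next_pos)
print(following)
```

Because everything here is positional, the duplicate timestamps in the index do not matter, and positions[-1] gives the most recent match if only that one is needed.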
Upvotes: 1