Reputation: 835
I have two data frames. One dataframe (A) looks like:
Name begin stop ID
Peter 30 150 1
Hugo 4500 6000 2
Jennie 300 700 3
The other dataframe (B) looks like
entry string
89 aa
568 bb
938437 cc
I want to accomplish two tasks here:
lst = [0,1]. ### because row 0 of B falls in interval of row 1 in A and row 1 of B falls in interval of row 3 of A.
entry string
938437 cc
How can I accomplish these two tasks?
Upvotes: 1
Views: 246
Reputation: 24304
Make use of between()
method and tolist()
method to get list of indexes:
lst=B[B['entry'].between(A.loc[0,'begin'],A.loc[len(A)-1,'stop'])].index.tolist()
Finally make use of isin()
method and boolean masking:
result=B[~B.index.isin(lst)]
Upvotes: 1
Reputation: 29635
You can use merge_asof
l = (pd.merge_asof(dfB['entry'].reset_index() #to keep original index after merge
.sort_values('entry'), #mandatory to use this merge_asof
dfA[['begin','stop']].sort_values('begin'),
left_on='entry', right_on='begin',
direction='backward') # begin lower than entry
.query('stop >= entry') # keep only where entry lower than stop
['index'].tolist()
)
print(l)
# Int64Index([0, 1], dtype='int64')
new_df = dfB.loc[dfB.index.difference(l)]
print(new_df)
# entry string
# 2 938437 cc
Now if you don't need the list onf index and that you real goal is the new_df, then you can do directly
new_df = (pd.merge_asof(dfB.sort_values('entry'),
dfA[['begin','stop']].sort_values('begin'),
left_on='entry', right_on='begin',
direction='backward')
.query('stop < entry') #here different inegality
.drop(['begin','stop'], axis=1) #clean the result
.reset_index(drop=True)
)
print(new_df)
Upvotes: 1