Reputation: 11
I got two Dataframes. The goal is to filter out rows in DF1 that have an Integer value that lies between any of the Integers in the ["Begin"] and ["End"] columns in any of the 37 rows in DF2.
DF1:
INDEX String IntValues
1 "string" 808091
2 "string" 1168262
3 "string" 1169294
... ... ...
647 "string" 14193661
648 "string" 14551918
DF2:
Index Begin End
1 1196482.2 1216529
2 1791819.7 1834887
3 2008405.1 2014344
... ... ...
36 14168540.0 14193933
37 14727507.1 14779605
I think it is possible to use something like :
df1[(df1["IntValues"] >=1196482.2 ) & (df1["IntValues"] <= 1216529),(... 36 more conditions)].
Is there a better way than just writing down these 37 conditions, like a variable for the begin and end values of that "filter window" ?
Edit: As requested a code sample, not from the original DF, but i hope it suffices.
d1 = {
"string":["String0", "String1", "String2", "String3", "String4", "String5", "String6", "String7", "String8", "String9", "String10", "String11", "String12", "String13", "String14"],
"timestamp":[1168262,1169294, 1184451, 1210449,1210543,1210607, 1644328,
1665732, 1694388,1817309,1822872,1825310,2093796,2182923,2209252 ],
"should be in": ["Yes", "Yes", "Yes", "No", "No","No", "yes","yes","yes","no","no", "no","yes","yes","no"]
}
df1 = pd.DataFrame(d1)
d2={
'begin' : [1196482.2,1791819.7,2199564.6],
'end' : [1216529,1834887,2212352]
}
df2 = pd.DataFrame(d2)
Upvotes: 1
Views: 115
Reputation: 1413
Try this:
df_final=[]
for i,j in zip(df2["Begin"],df2["End"]):
x=df1[(df1["IntValues"] >=i ) & (df1["IntValues"] <= j)]
df_final.append(x)
df_final=pd.concat(df_final,axis=0).reset_index(drop=True)
df_final=df_final.drop_duplicates()
Upvotes: 2