Filtering a pandas DataFrame using vectorization

Question

I have a DataFrame called df with x rows and y columns. I have another DataFrame, df2, with fewer than x rows and y-1 columns. I want to filter df for rows that are identical to the rows of df2 in columns 1 to y-1. Is there a way to do that in a vectorized fashion, without iterating through the rows of df2?

Here is the code for a sample df:

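import pandas
import numpy.random as rd

# Eight rows of random data, indexed by date
dates = pandas.date_range('1/1/2000', periods=8)
df = pandas.DataFrame(rd.randn(8, 5), index=dates, columns=['call/put', 'expiration', 'strike', 'ask', 'bid'])
# Rows 2 and 3 get ask = bid = 0; row 2 gets strike 0.5
df.iloc[2, 4] = 0
df.iloc[2, 3] = 0
df.iloc[3, 4] = 0
df.iloc[3, 3] = 0
df.iloc[2, 2] = 0.5
# Append a copy of row 2, then set its ask and bid to 1 and its strike to 0.6
# (DataFrame.append was removed in pandas 2.0; pandas.concat does the same job)
df = pandas.concat([df, df.iloc[2:3]])
df.iloc[8:9, 3:5] = 1
df.iloc[8:9, 2:3] = 0.6
# Append a copy of that new row and set its strike to 0.4
df = pandas.concat([df, df.iloc[8:9]])
df.iloc[9, 2] = 0.4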

df2 is calculated as follows:

df2 = df[(df["ask"] == 0) & (df["bid"] == 0)]

Now I want to filter df for rows that look like those in df2, except for the strike column, which should have a value of 0.4. The filtering should be done without iteration, because my real-world df is very large.
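
One possible vectorized approach is sketched below. It assumes that "look like those in df2" means matching on the call/put and expiration columns only (the list key_cols is a name introduced here for illustration), since the ask and bid values of the 0.4-strike row in the sample differ from those in df2. The sample data is rebuilt with a seeded generator so the snippet runs on its own. The key columns of both frames are turned into MultiIndex objects, and MultiIndex.isin tests membership for all rows at once.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (seeded here for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=8)
cols = ["call/put", "expiration", "strike", "ask", "bid"]
df = pd.DataFrame(rng.standard_normal((8, 5)), index=dates, columns=cols)
df.iloc[2:4, 3:5] = 0                      # rows 2 and 3: ask = bid = 0
df.iloc[2, 2] = 0.5                        # row 2: strike 0.5
df = pd.concat([df, df.iloc[2:3]])         # row 8: copy of row 2
df.iloc[8, 3:5] = 1
df.iloc[8, 2] = 0.6
df = pd.concat([df, df.iloc[8:9]])         # row 9: copy of row 8
df.iloc[9, 2] = 0.4

df2 = df[(df["ask"] == 0) & (df["bid"] == 0)]

# Assumption: rows "look alike" when these columns are equal.
key_cols = ["call/put", "expiration"]

# Build one tuple-like key per row and test membership in a single call.
row_keys = pd.MultiIndex.from_frame(df[key_cols])
wanted_keys = pd.MultiIndex.from_frame(df2[key_cols].drop_duplicates())
same_contract = row_keys.isin(wanted_keys)

# Compare the float strike with a tolerance rather than with ==.
strike_ok = np.isclose(df["strike"], 0.4)

result = df[same_contract & strike_ok]
print(result)
```

Because the mask is a plain boolean array aligned by position, the original (duplicated) date index of df is preserved in result. An inner merge on key_cols would be another option, at the cost of having to reset and restore the index.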


