python/pandas: why the empty dataframe when using isin?

Question

I am trying to find where "code" from df2 appears in any of the "code1,code2,code3" columns in df1 and return the rows from df2 where that match was successful.

Here is my sample code:

df1 = pd.DataFrame(
        {
        'terms' : ['term1','term2'],
        'code1': ['1234x', '4321y'],
        'code2': ['2345x','5432y'],
        'code3': ['3456x','6543y']
        }
        )
df1 = df1[['terms'] + df1.columns[:-1].tolist()]

df2 = pd.DataFrame(
        {
        'name': ['Dan','Sara'],
        'rate': ['3','3.5'],
        'location': ['FL','OH'],
        'code': ['4444g','6543y']                           
         })
df2 = df2[['name','rate','location','code']]

#combining code1,code2,code3 into new column df1['allcodes']
df1['allcodes'] = df1[df1.columns[1:]].apply(lambda x: ','.join(x.dropna().astype(str)),axis=1)

print(df2[df2['code'].isin(df1['allcodes'])])

The desired result would be the row from df2 where the code was found in df1:

   name rate location   code
1  Sara  3.5       OH  6543y

The result is:

Empty DataFrame
Columns: [name, rate, location, code]
Index: []

why is this returning an empty dataframe?

piRSquared · Accepted Answer

Option 1
np.in1d
Indiscriminate check if a df2.code for a particular row is anywhere in df1

df2[np.in1d(df2.code, df1[['code1', 'code2', 'code3']])]

   name rate location   code
1  Sara  3.5       OH  6543y

Speed it up

df2[np.in1d(df2.code.values, df1[['code1', 'code2', 'code3']].values)]

Option 2
Check to see if df2.loc[x, 'code'] is in df1.loc[x]
We use the pd.DataFrame.eq method so we can pass the axis=0 parameter which checks if a series is equal to each column. Basically, axis=0 means, align the index (not columns).

df2[df1[['code1', 'code2', 'code3']].eq(df2.code, 0).any(1)]

   name rate location   code
1  Sara  3.5       OH  6543y

Speed it up

df2[(df1[['code1', 'code2', 'code3']].values == df2['code'].values[:, None]).any(1)]

python/pandas: why the empty dataframe when using isin?

Answers (2)

Related Questions