amrutha
amrutha

Reputation: 193

lookup from multiple columns pandas

i have 2 dataframes df1 & df2 as given below:

df1:

a
T11552
T11559
T11566
T11567
T11569
T11594
T11604
T11625

df2:

a   b
T11552  T11555
T11560  T11559
T11566  T11562
T11568  T11565
T11569  T11560
T11590  T11594
T11604  T11610
T11621  T11625
T11633  T11631
T11635  T11634
T13149  T13140

I want to have a new dataframe df3 where i want to search the value of df1 in df2. if the value is present in df2, i want to add new column in df1 returning True/False as shown below.

df3:

a   v
T11552  TRUE
T11559  TRUE
T11566  TRUE
T11567  FALSE
T11569  TRUE
T11594  TRUE
T11604  TRUE
T11625  TRUE
T11633  TRUE
T11634  TRUE

Upvotes: 1

Views: 1927

Answers (2)

jpp
jpp

Reputation: 164693

Try this:

df3 = df1[['a']].copy()
df3['v'] = df3['a'].isin(set(df2.values.ravel()))

The above code will:

  1. Create a new dataframe using column 'a' from df1.
  2. Create a Boolean column 'v' testing the existence of each value of column 'a' versus values in df2 via set and numpy.ravel.

Upvotes: 0

jezrael
jezrael

Reputation: 862731

Use assign for new DataFrame with isin and converting all values to flatten array by ravel, for improve performance is possible check only unique values and also check by in1d:

df3 = df1.assign(v = lambda x: x['a'].isin(np.unique(df2.values.ravel())))
#alternative solution
#df3 = df1.assign(v = lambda x: np.in1d(x['a'], np.unique(df2[['a','b']].values.ravel())))

#if need specify columns in df2 for check
df3 = df1.assign(v = lambda x: x['a'].isin(np.unique(df2[['a','b']].values.ravel())))
print (df3)
        a      v
0  T11552   True
1  T11559   True
2  T11566   True
3  T11567  False
4  T11569   True
5  T11594   True
6  T11604   True
7  T11625   True

Upvotes: 3

Related Questions