Reputation: 1052
I have two dataframes:
DF1
A B
'a' 'x'
'b' 'y'
'c' 'z'
DF2
Col1 Col2
'j' 'm'
'a' 'x'
'k' 'n'
'b' 'y'
I want to check whether each row of DF1 is contained in DF2, and add that result as a column Bool_col on DF1, like this:
DF1
A B Bool_col
'a' 'x' True
'b' 'y' True
'c' 'z' False
I've tried looking up the concatenation of A and B in a list of the concatenations of Col1 and Col2, but my data is giving me unexpected trouble. Any help on how to do this without concatenating columns?
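For reference, the example frames can be rebuilt with something like this (assuming the quotes in the tables above are just display notation and the actual values are plain strings):

import pandas as pd

# hypothetical reconstruction of the example data shown above
df1 = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['x', 'y', 'z']})
df2 = pd.DataFrame({'Col1': ['j', 'a', 'k', 'b'], 'Col2': ['m', 'x', 'n', 'y']})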
Upvotes: 2
Views: 193
Reputation: 13401
Use pandas.merge and numpy.where
import numpy as np

# left-merge df1 with df2 and keep the merge indicator column '_merge'
df = df1.merge(df2, how='left', indicator=True, left_on=['A', 'B'], right_on=['Col1', 'Col2'])
# rows found in both frames get True, the rest False
df['Bool_col'] = np.where(df['_merge'] == 'both', True, False)
# drop the helper columns brought in by the merge
df.drop(columns=['_merge', 'Col1', 'Col2'], inplace=True)
print(df)
Output:
A B Bool_col
0 a x True
1 b y True
2 c z False
Edit
As @cs95 suggested in the comments, np.where is unnecessary here.
You can simply do
df1['Bool_col'] = df['_merge'] == 'both'
# df.drop(columns=['_merge', 'Col1', 'Col2'], inplace=True)  # no longer needed
Upvotes: 3
Reputation: 402353
Use merge with the indicator argument, then check which rows show "both".
df1['Bool_col'] = (df1.merge(df2,
                             how='left',
                             left_on=['A', 'B'],
                             right_on=['Col1', 'Col2'],
                             indicator=True)
                      .eval('_merge == "both"'))
df1
A B Bool_col
0 'a' 'x' True
1 'b' 'y' True
2 'c' 'z' False
Upvotes: 3