TimothyWhite

Reputation: 1

Determine whether a data element in a Pandas Dataframe is present in another Dataframe

I have two pandas DataFrames (df1 and df2) with identical structures, each with several hundred thousand rows.

I'd like to update the Match field of each row to indicate whether that row's ID appears anywhere in the ID column of the other dataframe.

import pandas as pd
df1 = pd.DataFrame([['AAA',''],['BBB',''],['CCC','']], columns=['ID','Match'])
df2 = pd.DataFrame([['FFF',''],['BBB',''],['AAA',''],['BBB','']], columns=['ID','Match'])

And I'd like to end up with a result that looks like:

ID   Match
FFF  N
BBB  Y
AAA  Y
BBB  Y

Upvotes: 0

Views: 41

Answers (1)

user7864386

You could join the IDs in df1 into a single pattern and use str.contains to flag the rows of df2 whose ID contains any ID from df1; then use np.where to assign "Y" where there is a match and "N" otherwise:

import numpy as np
df2['Match'] = np.where(df2['ID'].str.contains('|'.join(df1['ID'])), 'Y', 'N')

Output:

    ID Match
0  FFF     N
1  BBB     Y
2  AAA     Y
3  BBB     Y
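
One caveat: str.contains treats the joined string as a regular expression and matches substrings, so an ID such as 'AAAB' in df2 would also be flagged because it contains 'AAA'. If the IDs are meant to match as whole values, Series.isin may be a better fit, and it avoids building one very large pattern when df1 has several hundred thousand rows. A minimal sketch, assuming exact matching is what's wanted and reusing the sample frames from the question:

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame([['AAA', ''], ['BBB', ''], ['CCC', '']], columns=['ID', 'Match'])
df2 = pd.DataFrame([['FFF', ''], ['BBB', ''], ['AAA', ''], ['BBB', '']], columns=['ID', 'Match'])

# isin tests each df2 ID for exact membership in df1's ID column
# (no regex, no substring matching).
df2['Match'] = np.where(df2['ID'].isin(df1['ID']), 'Y', 'N')

print(df2)
```

On the sample data this gives the same Match column as the str.contains version; the two only differ when one ID is a substring of another or when an ID contains regex metacharacters.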

Upvotes: 1
