guarrana

Reputation: 79

How to drop rows in one DataFrame based on matching values in a column of another DataFrame that has a different number of rows

I have two DataFrames that are completely dissimilar except for certain values in one particular column:

df
   First   Last   Email             Age
0  Adam    Smith  [email protected]   30
1  John    Brown  [email protected]   35
2  Joe     Max    [email protected]   40
3  Will    Bill   [email protected]   25
4  Johnny  Jacks  [email protected]   50

df2
   ID    Location  Contact
0  5435  Austin    [email protected]
1  4234  Atlanta   [email protected]
2  7896  Toronto   [email protected]

How would I go about finding the matching values in the Email column of df and the Contact column of df2, and then dropping the whole row in df based on that match?

Output I'm looking for (index numbering doesn't matter):

df1
  First  Last   Email             Age
1 John   Brown  [email protected]  35
3 Will   Bill   [email protected]  25

I've been able to identify matches using a few different methods like:

Renaming the columns so they are identical, then merging and filtering:

common = df.merge(df2,on=['Email'])
df3 = df[(~df['Email'].isin(common['Email']))]

But df3 still shows all the rows from df.

I've also tried:

common = df['Email'].isin(df2['Contact'])
df.drop(df[common].index, inplace = True)

Again, this identifies the matches, but df still contains all the original rows.

So my main difficulty is either updating df with the matching rows dropped, or creating a new DataFrame that contains only the rows of df whose Email value does not appear in the Contact column of df2. I'd appreciate any suggestions.

Upvotes: 0

Views: 68

Answers (1)

kaihami

Reputation: 815

As mentioned in the comments (@Arkadiusz), it is enough to filter your data with a negated isin mask:

df3 = df[(~df['Email'].isin(df2.Contact))].copy()
print(df3)
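
For reference, here is a minimal self-contained sketch of the same filter on data shaped like the question's. The email addresses are hypothetical placeholders (the real ones are obscured in the question), and the names matches and df_inplace are only for illustration. It also shows the drop variant from the question, which should give the same result when the mask is built from df itself.

```python
import pandas as pd

# Hypothetical addresses: the originals are obscured in the question.
df = pd.DataFrame({
    "First": ["Adam", "John", "Joe", "Will", "Johnny"],
    "Last": ["Smith", "Brown", "Max", "Bill", "Jacks"],
    "Email": ["adam@example.com", "john@example.com", "joe@example.com",
              "will@example.com", "johnny@example.com"],
    "Age": [30, 35, 40, 25, 50],
})
df2 = pd.DataFrame({
    "ID": [5435, 4234, 7896],
    "Location": ["Austin", "Atlanta", "Toronto"],
    "Contact": ["adam@example.com", "joe@example.com", "johnny@example.com"],
})

# Boolean mask: True where a row's Email also appears in df2's Contact column.
matches = df["Email"].isin(df2["Contact"])

# Keep only the non-matching rows; .copy() makes df3 an independent DataFrame.
df3 = df[~matches].copy()
print(df3)

# Variant that drops the matching rows in place, done on a copy so df stays intact.
df_inplace = df.copy()
df_inplace.drop(df_inplace[matches].index, inplace=True)
```

With this sample data, only the John and Will rows remain. The .copy() call is there so that later assignments to df3 do not trigger pandas' SettingWithCopyWarning.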

Upvotes: 1
