Uzair

Reputation: 51

Finding non-matching rows between two dataframes

I want to find the non-matching rows between two dataframes. Both dataframes have around 30 columns plus an id column that uniquely identifies each record/row. For each id, I want to check whether the row in df1 differs from the row in df2. df1 is the updated dataframe and df2 is the previous version.

I have tried pd.concat([df1, df2]).drop_duplicates(keep=False), but it just combines both dataframes. Is there a way to do this? I would really appreciate the help.

The sample data looks like this for both dfs.

id user_id type status

There will be 39 columns in total, and they may contain NULL values.

Thanks.

P.S. df2 will always be a subset of df1.

Upvotes: 2

Views: 2251

Answers (1)

jitvimol

Reputation: 72

If df1 and df2 have the same shape (and the same index and column labels), you can compare them cell by cell with this code.

df3 = pd.DataFrame(np.where(df1==df2,True,False), columns=df1.columns)

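The result contains False for every cell whose value does not match. Note that NaN never compares equal to NaN, so a cell that is NULL in both dataframes also shows False.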
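Since the question says df2 is a subset of df1, the two frames will usually not have the same shape. The sketch below is one possible way to handle that: it aligns both frames on the id column, treats cells that are NULL on both sides as equal, and returns the rows of df1 that are new or changed. The function name changed_rows, the key parameter, and the sample values are made up for illustration; only the column names come from the question.

```python
import pandas as pd


def changed_rows(df1: pd.DataFrame, df2: pd.DataFrame, key: str = "id") -> pd.DataFrame:
    """Return rows of df1 that are missing from df2 or differ from df2.

    Assumes `key` is unique in both frames (hypothetical helper, not a pandas API).
    """
    new = df1.set_index(key)
    old = df2.set_index(key)
    # Give the old frame the same row and column labels as the new one;
    # ids that only exist in df1 become all-NaN rows here.
    aligned = old.reindex(index=new.index, columns=new.columns)
    # A cell counts as equal if the values match or both sides are missing.
    same = (new == aligned) | (new.isna() & aligned.isna())
    # A row is unchanged only if its id existed before and every cell is equal.
    unchanged = same.all(axis=1) & new.index.isin(old.index)
    return new.loc[~unchanged].reset_index()


# Made-up sample data using the column names from the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "user_id": [10, 20, 30, 40],
    "type": ["a", "b", None, "d"],
    "status": ["open", "closed", "open", "open"],
})
df2 = pd.DataFrame({
    "id": [1, 2, 3],
    "user_id": [10, 20, 30],
    "type": ["a", "b", None],
    "status": ["open", "open", "open"],
})

result = changed_rows(df1, df2)
print(result)
```

With this sample data the result should contain id 2 (status changed) and id 4 (not present in df2), while id 3 is treated as unchanged even though its type is NULL in both frames.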

Upvotes: 1
