Tokyo

Reputation: 823

Compare every element between two dataframes

Assuming that I have two dataframes:

# df1
+--------+-----+----------+
| Name_1 | Age | Location |
+--------+-----+----------+
| A      | 18  | UK       |
| B      | 19  | US       |
+--------+-----+----------+

# df2
+--------+-----+----------+
| Name_2 | Age | Location |
+--------+-----+----------+
| A      | 18  | US       |
| B      | 19  | US       |
+--------+-----+----------+

How can I compare all of the elements and get a dataframe with boolean values that indicate whether the corresponding values match?

The desired output would be:

# desired
+------+------+----------+
| Name | Age  | Location |
+------+------+----------+
| A    | True | False    |
| B    | True | True     |
+------+------+----------+

Upvotes: 1

Views: 340

Answers (1)

jezrael

Reputation: 862781

If both DataFrames have the same number of rows and the same column names, move the name column (called name here) into the index of each with DataFrame.set_index and then compare:

df11 = df1.set_index('name')
df22 = df2.set_index('name')
df = (df11 == df22).reset_index()

EDIT: If only the columns used for the index are named differently:

df11 = df1.set_index('Name_1')
df22 = df2.set_index('Name_2')
df = (df11 == df22).reset_index()
print (df)
  Name_1   Age  Location
0      A  True     False
1      B  True      True
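
For anyone who wants to run this directly, here is a self-contained sketch that builds df1 and df2 from the tables in the question. The rename_axis("Name") step is my addition, not part of the snippet above; it only renames the index so the result has the Name column shown in the desired output.

```python
import pandas as pd

# Data copied from the tables in the question.
df1 = pd.DataFrame({"Name_1": ["A", "B"], "Age": [18, 19], "Location": ["UK", "US"]})
df2 = pd.DataFrame({"Name_2": ["A", "B"], "Age": [18, 19], "Location": ["US", "US"]})

# Move the name columns into the index so only Age and Location are compared.
df11 = df1.set_index("Name_1")
df22 = df2.set_index("Name_2")

# Element-wise comparison; rename the index to "Name" (an addition for
# illustration) before turning it back into a regular column.
result = (df11 == df22).rename_axis("Name").reset_index()
print(result)
```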

If the other columns may also be named differently, but both DataFrames still have the same number of columns and the same number of rows, the column names must be made identical first, e.g. by assigning the columns of df11 to df22:

print (df1)
  Name_1  Age1 Location1
0      A    18        UK
1      B    19        US

print (df2)
  Name_2  Age2 Location2
0      A    18        US
1      B    19        US

df11 = df1.set_index('Name_1')
df22 = df2.set_index('Name_2')
df22.columns = df11.columns
df = (df11 == df22).reset_index()
print (df)
  Name_1  Age1  Location1
0      A  True      False
1      B  True       True
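
One caveat: == only works when both DataFrames carry identical index and column labels in the same order, otherwise pandas raises a ValueError. If the rows may come in a different order, DataFrame.eq is an option, because it aligns on the labels before comparing. A minimal sketch, assuming a reordered df2 (made up for illustration) and a common index name "Name" (also my choice):

```python
import pandas as pd

df1 = pd.DataFrame({"Name_1": ["A", "B"], "Age1": [18, 19], "Location1": ["UK", "US"]})
# Same values as df2 above, but with the rows reversed (assumed for illustration).
df2 = pd.DataFrame({"Name_2": ["B", "A"], "Age2": [19, 18], "Location2": ["US", "US"]})

# Give both indexes the same name so the aligned result keeps it.
df11 = df1.set_index("Name_1").rename_axis("Name")
df22 = df2.set_index("Name_2").rename_axis("Name")
df22.columns = df11.columns

# df11 == df22 would raise ValueError here because the row order differs;
# DataFrame.eq aligns on index and column labels before comparing.
aligned = df11.eq(df22).reset_index()
print(aligned)
```

Labels that exist in only one of the two DataFrames end up as False after alignment, so check the row counts first if that matters.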

Upvotes: 3
