Reputation: 287
I have two dataframes that are identical in structure: the same number of rows and columns, and the same column names.
I'm trying to compare them row by row, showing only the differences, using the method below.
import pandas as pd
df1 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 25, 20], 'City':['New York','',None]})
df2 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 19, 20], 'City':['New York',None,'New Mexico']})
I can do the comparison for a single row, but I can't restrict the result to only the entries that are False.
df1.loc[1].fillna('') == df2.loc[1].fillna('')
Name True
Age False
City True
Name: 1, dtype: bool
I'm planning to run this in a loop over the rows. It would also be nice to show both values wherever the comparison is False.
for i in range(df1.shape[0]):  # start at 0 so the first row is not skipped
    print(df1.loc[i].fillna('') == df2.loc[i].fillna(''))
Upvotes: 0
Views: 43
Reputation: 310
You have to filter the Boolean Series created by the comparison, using its own negation as the mask so that only the False entries remain:
equality_series = df1.loc[1].fillna('') == df2.loc[1].fillna('')
equality_series[~equality_series]
Output:
Age False
Name: 1, dtype: bool
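To also show both values for each mismatch, as the question asks, the same mask can be used to index the two rows. This is only a minimal sketch: `row_differences` and the `df1`/`df2` column labels in its output are names made up for illustration, and it assumes both frames share the same index and columns.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 25, 20], 'City':['New York','',None]})
df2 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 19, 20], 'City':['New York',None,'New Mexico']})

def row_differences(left, right, i):
    # Hypothetical helper: returns the mismatching fields of row i side by side.
    a = left.loc[i].fillna('')   # treat missing values and '' as equal, as in the question
    b = right.loc[i].fillna('')
    mismatch = a != b            # True where the two rows disagree
    return pd.DataFrame({'df1': a[mismatch], 'df2': b[mismatch]})

# One small frame per row; rows without differences give an empty frame.
diffs = {i: row_differences(df1, df2, i) for i in df1.index}

for i, d in diffs.items():
    if not d.empty:
        print(f"row {i}")
        print(d)
```

If the loop is not actually required, `df1.fillna('').compare(df2.fillna(''))` (available in pandas 1.1 and later) should report the same mismatches for the whole frame in one call.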
Upvotes: 1