Python: keep only the False results of a pandas row equality comparison

Background

I have two dataframes with identical structure: the same number of rows and columns, and the same column names.

Example

I'm trying to compare them row by row and highlight only the differences, using the data and method below.

import pandas as pd

df1 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 25, 20], 'City':['New York','',None]})
df2 = pd.DataFrame({'Name': ['Adam','Sara','John'], 'Age': [22, 19, 20], 'City':['New York',None,'New Mexico']})

I can compare a single row like this, but I can't restrict the result to only the False entries.

df1.loc[1].fillna('') == df2.loc[1].fillna('') 

Name     True
Age     False
City     True
Name: 1, dtype: bool

I'm planning to run this in a loop over all rows. It would also be nice to show both values wherever the result is False.

# start at 0 so the first row is not skipped, and print each result
for i in range(df1.shape[0]):
    print(df1.loc[i].fillna('') == df2.loc[i].fillna(''))

Upvotes: 0

Views: 43

Answers (1)

EEtch

Reputation: 310

You have to filter the Boolean Series created by the comparison, using its own negation as the mask so that only the False entries remain:

equality_series = df1.loc[1].fillna('') == df2.loc[1].fillna('')
equality_series[~equality_series]

Output:

Age    False
Name: 1, dtype: bool
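To cover the loop and the "show both values" part of the question, here is a minimal sketch that extends the same idea to every row. It assumes both frames share the same index and columns, as stated in the question. The names `diffs`, `left`, `right` and `mismatch` are only illustrative and do not come from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Adam', 'Sara', 'John'],
                    'Age': [22, 25, 20],
                    'City': ['New York', '', None]})
df2 = pd.DataFrame({'Name': ['Adam', 'Sara', 'John'],
                    'Age': [22, 19, 20],
                    'City': ['New York', None, 'New Mexico']})

# Hypothetical container: row label -> small frame holding both differing values
diffs = {}

for i in df1.index:
    # Treat missing values and empty strings as equal, as in the question
    left = df1.loc[i].fillna('')
    right = df2.loc[i].fillna('')

    # True where the two rows disagree
    mismatch = left != right

    if mismatch.any():
        # Keep only the differing columns, with both values side by side
        diffs[i] = pd.DataFrame({'df1': left[mismatch], 'df2': right[mismatch]})

for i, diff in diffs.items():
    print(f"Row {i}:")
    print(diff)
```

Rows without any difference are skipped, so only the mismatching columns are printed with their value from each frame. If a loop is not strictly required and you are on pandas 1.1 or later, `df1.fillna('').compare(df2.fillna(''))` should give a similar side-by-side view in a single call.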

Upvotes: 1
