Prasanth NSR

Reputation: 21

How to compare 3 or more DataFrames for equality in Pandas?

For example, I have three DataFrames, titanic, titanic_new and titanic_copy, which hold identical data.

I used the following code to compare them and got the expected result:

(titanic.equals(titanic_copy)) and (titanic.equals(titanic_new)) and (titanic_copy.equals(titanic_new))

Output: True

Is there a better way to compare three DataFrames, or a built-in method for comparing three or more?

TIA

Upvotes: 2

Views: 255

Answers (1)

cs95

Reputation: 402473

This expression returns True if all the DataFrames in df_list are equal:

all(x.equals(y) for x, y in zip(df_list[:-1], df_list[1:]))

To understand why this works, consider:

df_list = [dfA, dfB, dfC]

Our expression computes the following:

dfA.equals(dfB)
dfB.equals(dfC)

If both of these conditions are True, we know all the frames are equal. Equality is transitive: if A equals B and B equals C, then A equals C, and so on for longer lists.
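The pairwise check can be wrapped in a small function if you need it in more than one place. This is only a sketch: all_frames_equal is a hypothetical helper name, not a pandas function, and dfA through dfD are made-up frames for illustration.

```python
import pandas as pd


def all_frames_equal(frames):
    """Return True if every DataFrame in `frames` equals its neighbour.

    Hypothetical helper for illustration; pandas does not ship this function.
    """
    frames = list(frames)
    # Compare each frame with the next one; all() stops at the first mismatch.
    return all(x.equals(y) for x, y in zip(frames[:-1], frames[1:]))


# Made-up frames: dfB and dfC are copies of dfA, dfD differs in one value.
dfA = pd.DataFrame({"A": [1, 2, 3]})
dfB = dfA.copy()
dfC = dfA.copy()
dfD = pd.DataFrame({"A": [1, 2, 4]})

print(all_frames_equal([dfA, dfB, dfC]))  # True
print(all_frames_equal([dfA, dfB, dfD]))  # False
```

For n frames this makes n - 1 calls to equals, instead of comparing every possible pair.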


Minimal Example

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
dfl1 = [df, df, df, df, df]  # every frame is identical
dfl2 = [df2, df, df2]        # neighbouring frames differ

all(x.equals(y) for x, y in zip(dfl1[1:], dfl1[:-1]))
# True

all(x.equals(y) for x, y in zip(dfl2[1:], dfl2[:-1]))
# False
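If the check returns False and you want to know which neighbouring frames differ, the same zip can be paired with enumerate. This is a rough sketch that reuses the frames from the example above; mismatches is just an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
dfl2 = [df2, df, df2]

# Collect the list positions of every adjacent pair that is not equal.
mismatches = [
    (i, i + 1)
    for i, (x, y) in enumerate(zip(dfl2[:-1], dfl2[1:]))
    if not x.equals(y)
]
print(mismatches)  # [(0, 1), (1, 2)]
```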

Upvotes: 2
