PH82

Reputation: 359

Pandas Dataframe Comparison and Floating Point Precision

I'm looking to compare two dataframes which should be identical. However, due to floating point precision, the comparison reports that the values don't match. I have created an example to simulate it below. How can I get the correct result, so that the final comparison dataframe returns True for both cells?

import pandas as pd

a = pd.DataFrame({'A':[100,97.35000000001]})
b = pd.DataFrame({'A':[100,97.34999999999]})
print(a)

   A  
0  100.00  
1   97.35  

print(b)

   A  
0  100.00  
1   97.35  

print (a == b)

   A  
0  True  
1  False  

Upvotes: 21

Views: 11208

Answers (2)

EdGaere

Reputation: 1474

You can use pandas' built-in assert_frame_equal, which by default compares floating point columns within a tolerance, much like numpy's isclose(). The advantage is that you can pass an entire dataframe with mixed column types. Note that it raises an AssertionError on a mismatch rather than returning a boolean dataframe.

For fine-tuning, see the rtol and atol arguments.

from pandas.testing import assert_frame_equal

assert_frame_equal(df1, df2)
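As a rough sketch of how this applies to the frames from the question: the snippet below assumes pandas 1.1 or later (where rtol and atol are accepted), and frames_close is a hypothetical helper, not part of pandas, that turns the assertion into a True/False result.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

a = pd.DataFrame({'A': [100, 97.35000000001]})
b = pd.DataFrame({'A': [100, 97.34999999999]})

# Passes silently: the difference (about 2e-11) is within the default tolerances.
assert_frame_equal(a, b)


def frames_close(left, right, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: True if the frames match within the tolerances."""
    try:
        assert_frame_equal(left, right, rtol=rtol, atol=atol)
    except AssertionError:
        return False
    return True


print(frames_close(a, b))                      # True
print(frames_close(a, b, rtol=0, atol=1e-12))  # False: tolerance tighter than the difference
```

Because assert_frame_equal is aimed at tests, it gives one pass/fail verdict for the whole frame; if a per-cell result is needed, an element-wise comparison is a better fit.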

Upvotes: 3

EdChum

Reputation: 394459

You can use np.isclose for this:

In [250]:
np.isclose(a,b)

Out[250]:
array([[ True],
       [ True]], dtype=bool)

np.isclose takes a relative tolerance (rtol) and an absolute tolerance (atol), with default values rtol=1e-05 and atol=1e-08. Two values a and b are treated as equal when abs(a - b) <= atol + rtol * abs(b).
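Since np.isclose returns a bare array, one possible way to get back the boolean dataframe the question asks for is to wrap the result with the original labels. This is only a sketch; the variable names close and strict are illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'A': [100, 97.35000000001]})
b = pd.DataFrame({'A': [100, 97.34999999999]})

# Wrap the ndarray result so the index and column labels are preserved.
close = pd.DataFrame(np.isclose(a, b), index=a.index, columns=a.columns)
print(close)

# A single yes/no answer for the whole frame.
print(np.allclose(a, b))  # True

# With rtol=0 and a very small atol, the second row no longer counts as equal,
# because the values differ by about 2e-11.
strict = np.isclose(a, b, rtol=0, atol=1e-12)
print(strict)
```

The default tolerances suit values of ordinary magnitude; for very small numbers the absolute tolerance dominates, so atol may need to be adjusted.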

Upvotes: 22
