Reputation: 59
I am new to Python and DataFrames. I am currently trying to compare two DataFrames with the assert_frame_equal() function.
df1 =
   a  b
0  1  3
1  2  4

df2 =
   a    b
0  2  3.0
1  2  4.0
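Code:
import pandas as pd
# Use the public pandas.testing module rather than the private pandas._testing
from pandas.testing import assert_frame_equal

def test_compare_src_trg():
    df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
    df2 = pd.DataFrame({'a': [2, 2], 'b': [3.0, 4.0]})
    # Raises AssertionError at the first column that differs
    assert_frame_equal(df1, df2)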
When this is run, the assertion fails at the first column because the value 1 is not equal to 2, which is correct. But I want the assertion to check every item in the DataFrame and report an overall pass/fail result.
----------------------------------
raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 0] (column name="a") are different
DataFrame.iloc[:, 0] (column name="a") values are different (50.0 %)
[index]: [0, 1]
[left]: [1, 2]
[right]: [2, 2]
Process finished with exit code 1
Upvotes: 1
Views: 3268
Reputation: 21898
I think you have to use something else, like DataFrame.compare. It shows the full comparison, and you can assert that the resulting DataFrame is empty to check whether the two are equal.
cp = df1.compare(df2)
#      a
#   self other
# 0  1.0   2.0
assert cp.empty, "Dataframes are not equal"
# AssertionError: Dataframes are not equal
Note: compare can only be used on identically-labeled DataFrames (i.e. same shape, identical row and column labels).
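To tie the approach and the note together, here is a minimal sketch of a helper that checks the labels first and then puts the whole compare() result into the assertion message, so a single failure lists every differing cell. The name assert_frames_match is hypothetical, not part of pandas, and the sketch assumes pandas 1.1 or later (where DataFrame.compare is available).

```python
import pandas as pd


def assert_frames_match(left: pd.DataFrame, right: pd.DataFrame) -> None:
    """Hypothetical helper: fail once, but list every differing cell."""
    # DataFrame.compare raises ValueError on differently-labeled frames,
    # so check the labels first to get a clearer failure message.
    assert left.index.equals(right.index), "Row labels differ"
    assert left.columns.equals(right.columns), "Column labels differ"

    diff = left.compare(right)  # assumes pandas >= 1.1
    # An empty result means no differing values were found.
    assert diff.empty, f"DataFrames are not equal:\n{diff}"


df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df2 = pd.DataFrame({'a': [2, 2], 'b': [3.0, 4.0]})

try:
    assert_frames_match(df1, df2)
except AssertionError as err:
    # The message contains the full table of differences.
    print(err)
```

One thing to keep in mind: compare() looks at values only, so column b (3 vs 3.0) is treated as equal here, whereas assert_frame_equal checks dtypes by default. If dtype differences matter, they would need a separate check.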
Upvotes: 1