I am trying to unit test my code. I have a method that, given a MySQL query, returns the result as a pandas DataFrame. Note that in the database, all returned values in the created and external_id columns are NULL. Here is the test:
def test_get_data(self):
    ### SET UP
    self.report._query = "SELECT * FROM floor LIMIT 3"
    self.report._columns = ['id', 'facility_id', 'name', 'created', 'modified', 'external_id']
    self.d = {'id': p.Series([1, 2, 3]),
              'facility_id': p.Series([1, 1, 1]),
              'name': p.Series(['1st Floor', '2nd Floor', '3rd Floor']),
              'created': p.Series(['None', 'None', 'None']),
              'modified': p.Series([datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S')]),
              'external_id': p.Series(['None', 'None', 'None'])
              }
    self.df = p.DataFrame(data=self.d, columns=['id', 'facility_id', 'name', 'created', 'modified', 'external_id'])
    self.df.fillna('None')
    print(self.df)
    ### CODE UNDER TEST
    result = self.report.get_data(self.report._cursor_web)
    print(result)
    ### ASSERTIONS
    assert_frame_equal(result, self.df)
Here is the console output (note the print statements in the test code. The manually constructed dataframe is on top, the one derived from the function being tested is on the bottom):
.  id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
F
======================================================================
FAIL: test_get_data (__main__.ReportTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/to/file/ReportsTestCase.py", line 46, in test_get_data
    assert_frame_equal(result, self.df)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1313, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1181, in assert_series_equal
    obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 59, in pandas._testing.assert_almost_equal (pandas/src/testing.c:4156)
  File "pandas/src/testing.pyx", line 173, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3274)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1018, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 3] are different
DataFrame.iloc[:, 3] values are different (100.0 %)
[left]: [None, None, None]
[right]: [None, None, None]
----------------------------------------------------------------------
Ran 1 test in 0.354s
FAILED (failures=1)
By my reckoning, column 'created' contains three string values of 'None' in both the left and right dataframes. Why is it asserting not equal?
Upvotes: 2
Views: 3246
Python also has a built-in constant None that is different from the string 'None'. From the docs:
None
The sole value of the type NoneType. None is frequently used to represent the absence of a value, as when default arguments are not passed to a function. Assignments to None are illegal and raise a SyntaxError.
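Comparing None against 'None' (None == 'None') gives False. Therefore, assert_frame_equal will raise an AssertionError if one of the DataFrames contains None but the other contains 'None'. Both values are displayed as None when a DataFrame is printed, which is why the two frames look identical in the console output and in the [left]/[right] lines of the error message.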
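As a minimal sketch of the point above, the snippet below compares a frame holding the None object with one holding the string 'None'. The frame named from_db is a hypothetical stand-in for the query result, and it assumes the database driver hands NULL back as None (the question does not show which side holds which value). It imports assert_frame_equal from pandas.testing, the location used by newer pandas releases, rather than the pandas.util.testing module seen in the traceback.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical stand-in for the query result (assumption: NULL arrives as None).
from_db = pd.DataFrame({"id": [1, 2, 3], "created": [None, None, None]})

# Expected frame built with the *string* 'None', as in the question's test.
expected_str = pd.DataFrame({"id": [1, 2, 3], "created": ["None", "None", "None"]})

# The two frames print identically, yet the comparison fails.
try:
    assert_frame_equal(from_db, expected_str)
    frames_match = True
except AssertionError:
    frames_match = False

# Building the expected frame with the None object makes the comparison pass.
expected_none = pd.DataFrame({"id": [1, 2, 3], "created": [None, None, None]})
assert_frame_equal(from_db, expected_none)

# Inspecting the element types reveals what a column really holds.
kinds_db = [type(v).__name__ for v in from_db["created"]]
kinds_str = [type(v).__name__ for v in expected_str["created"]]
```

If this assumption holds for the test in the question, replacing 'None' with None in the 'created' and 'external_id' Series of the expected frame should be enough. Checking the element types of result['created'] in the same way would confirm which values the function actually returns.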
Upvotes: 1