Reputation: 54068
I came across this interesting example today
class TestableEq(object):
    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        if isinstance(other, TestableEq):
            other.eq_run = True
        return self is other
>>> eq = TestableEq()
>>> eq.eq_run
False
>>> eq == eq
True
>>> eq.eq_run
True
>>> eq = TestableEq()
>>> eq is eq
True
>>> eq.eq_run
False
>>> [eq] == [eq]
True
>>> eq.eq_run # Should be True, right?
False
>>> (eq,) == (eq,) # Maybe with tuples?
True
>>> eq.eq_run
False
>>> {'eq': eq} == {'eq': eq} # dicts?
True
>>> eq.eq_run
False
>>> import numpy as np # Surely NumPy works as expected
>>> np.array([eq]) == np.array([eq])
True
>>> eq.eq_run
False
So it seems that comparisons inside containers work differently in Python. I would expect the call to == to use each object's implementation of __eq__, otherwise what's the point? Additionally:
class TestableEq2(object):
    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        other.eq_run = True
        return False
>>> eq = TestableEq2()
>>> [eq] == [eq]
True
>>> eq.eq_run
False
>>> eq == eq
False
>>> eq.eq_run
True
Does this mean that Python uses is from within the containers' implementations of __eq__ instead? Is there a way around this?
My use case is that I am building a data structure inheriting from some of the collections ABCs and I want to write tests to make sure my structure is behaving correctly. I figured it would be simple to inject a value that recorded when it was compared, but to my surprise the test failed when checking to ensure that comparison occurred.
EDIT: I should mention that this is on Python 2.7, but I see the same behavior on 3.3.
Upvotes: 7
Views: 813
Reputation: 122091
Python's testing of equality for sequences goes as follows:
                 Lists identical?
                  /          \
                 Y            N
                /              \
             Equal        Same length?
                            /     \
                           Y       N
                          /         \
              Items identical?    Not equal
                  /       \
                 Y         N
                /           \
             Equal      Items equal?
                         /       \
                        Y         N
                       /           \
                    Equal       Not equal
You can see that the equality of the items at each position is tested only if the two sequences are the same length but the items at each position are not identical. If you want to force equality checks to be used, you need e.g.:
all(item1 == item2 for item1, item2 in zip(list1, list2))
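Note that zip stops at the shorter input, so the one-liner on its own would treat [1] and [1, 2] as equal. A minimal sketch that adds the length check follows; the names Spy and strict_sequence_eq are made up for illustration and are not part of any library.

```python
class Spy(object):
    """Hypothetical test double that records whether __eq__ was called."""

    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        return self is other


def strict_sequence_eq(seq1, seq2):
    """Compare two sequences, calling == on every pair, even identical ones."""
    if len(seq1) != len(seq2):
        return False
    # An explicit == always dispatches to __eq__; no identity shortcut here.
    return all(item1 == item2 for item1, item2 in zip(seq1, seq2))


spy = Spy()
builtin_result = [spy] == [spy]          # list comparison skips __eq__
ran_after_builtin = spy.eq_run
strict_result = strict_sequence_eq([spy], [spy])
ran_after_strict = spy.eq_run
```

Here ran_after_builtin stays False while ran_after_strict becomes True, which is the behaviour a comparison-recording test needs.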
Upvotes: 10
Reputation: 31339
CPython's underlying implementation will skip the equality check (==) for items in a list if the items are identical (is).
CPython uses this as an optimization assuming identity implies equality.
This is documented in PyObject_RichCompareBool, which is used to compare items:
Note: If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE.
From the listobject.c implementation:
/* Search for the first index where items are different */
for (i = 0; i < Py_SIZE(vl) && i < Py_SIZE(wl); i++) {
    int k = PyObject_RichCompareBool(vl->ob_item[i],
                                     wl->ob_item[i], Py_EQ);
    // k is 1 if objects are the same
    // because of RichCompareBool's behaviour
    if (k < 0)
        return NULL;
    if (!k)
        break;
}
As you can see, as long as PyObject_RichCompareBool returns 1 (true), the loop moves on to the next pair; for identical items it returns 1 without ever calling __eq__.
And from object.c's implementation of PyObject_RichCompareBool:
/* Quick result when objects are the same.
   Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}
// ... actually deep-compare objects
To override this you'll have to compare the items manually.
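If looping by hand is inconvenient, one possible alternative is to wrap each item in a fresh object, so that the two sides of the container comparison never hold the same object and the shortcut cannot apply. This is only a sketch: ForceEq and Spy are hypothetical names, not an existing API.

```python
class Spy(object):
    """Hypothetical test double that records whether __eq__ was called."""

    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        return self is other


class ForceEq(object):
    """Hypothetical wrapper: every instance is a distinct object, so a
    container comparing two wrappers has to fall back to __eq__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, ForceEq):
            other = other.value
        # Explicit ==, so the wrapped object's __eq__ is always invoked.
        return self.value == other

    def __ne__(self, other):
        # Python 2 does not derive != from ==.
        return not self == other


spy = Spy()
wrapped_equal = [ForceEq(spy)] == [ForceEq(spy)]
```

After this runs, spy.eq_run is True even though the same spy sits on both sides of the comparison.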
Upvotes: 14
Reputation: 1165
When comparing two lists, the CPython implementation short-circuits member comparisons using object identity (obj1 is obj2), because, according to a comment in the code:
/* Quick result when objects are the same.
Guarantees that identity implies equality. */
If the two objects are not exactly the same object, then CPython does a rich compare, using __eq__ if implemented.
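Both branches are easy to observe with NaN, which never compares equal to itself. A small illustration using only built-ins:

```python
nan = float("nan")

# Explicit == calls float's own comparison: NaN is not equal to itself.
direct = (nan == nan)

# Same object on both sides: the identity shortcut reports "equal".
same_object = ([nan] == [nan])

# Two distinct NaN objects: no shortcut, so the rich compare decides.
distinct = ([float("nan")] == [float("nan")])

print(direct, same_object, distinct)  # False True False
```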
Upvotes: 3
Reputation: 61925
If x is y there is no reason to call x == y, by contract of ==. Python is taking this shortcut.
This can be verified (or disproved) by creating an eq1 and an eq2 in the tests and then using [eq1] == [eq2].
class TestableEq(object):
    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        return True  # always assume equal for this test

eq1 = TestableEq()
eq2 = TestableEq()
eq3 = TestableEq()

# print(...) with a single argument behaves the same on Python 2.7 and 3
print([eq1] == [eq2])  # True
print(eq1.eq_run)      # True  - implies eq1 == eq2 was called
print(eq2.eq_run)      # False - but NOT eq2 == eq1
print([eq3] == [eq3])  # True
print(eq3.eq_run)      # False - implies NO eq3 == eq3
When the items are identical (is), no == is involved.
The difference with the dictionaries can be explained similarly.
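For dictionaries, the values appear to go through the same identity-first check, as the question's own output suggests. A minimal sketch of a comparison that forces == on every value; strict_dict_eq and Spy are hypothetical names, and key lookup still uses the dict's normal hashing rules:

```python
class Spy(object):
    """Hypothetical test double that records whether __eq__ was called."""

    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        return self is other


def strict_dict_eq(d1, d2):
    """Compare two dicts, calling == on every pair of values."""
    if len(d1) != len(d2):
        return False
    for key, value in d1.items():
        if key not in d2:
            return False
        # Explicit == on the values, so __eq__ runs even for identical objects.
        if not (value == d2[key]):
            return False
    return True


spy = Spy()
builtin_result = {'eq': spy} == {'eq': spy}   # values are identical: no __eq__
ran_after_builtin = spy.eq_run
strict_result = strict_dict_eq({'eq': spy}, {'eq': spy})
ran_after_strict = spy.eq_run
```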
Upvotes: 4