bheklilr

Reputation: 54068

How does Python 2.7 compare items inside a list

I came across this interesting example today:

class TestableEq(object):
    def __init__(self):
        self.eq_run = False
    def __eq__(self, other):
        self.eq_run = True
        if isinstance(other, TestableEq):
            other.eq_run = True
        return self is other

>>> eq = TestableEq()
>>> eq.eq_run
False
>>> eq == eq
True
>>> eq.eq_run
True
>>> eq = TestableEq()
>>> eq is eq
True
>>> eq.eq_run
False
>>> [eq] == [eq]
True
>>> eq.eq_run    # Should be True, right?
False
>>> (eq,) == (eq,)    # Maybe with tuples?
True
>>> eq.eq_run
False
>>> {'eq': eq} == {'eq': eq}    # dicts?
True
>>> eq.eq_run
False
>>> import numpy as np    # Surely NumPy works as expected
>>> np.array([eq]) == np.array([eq])
True
>>> eq.eq_run
False

So it seems that comparisons inside containers work differently in Python. I would expect the call to == to use each object's implementation of __eq__; otherwise, what's the point? Additionally:

class TestableEq2(object):
    def __init__(self):
        self.eq_run = False
    def __eq__(self, other):
        self.eq_run = True
        other.eq_run = True
        return False

>>> eq = TestableEq2()
>>> [eq] == [eq]
True
>>> eq.eq_run
False
>>> eq == eq
False
>>> eq.eq_run
True

Does this mean that Python uses is instead of == inside the containers' implementations of __eq__? Is there a way around this?

My use case is that I am building a data structure inheriting from some of the collections ABCs and I want to write tests to make sure my structure is behaving correctly. I figured it would be simple to inject a value that recorded when it was compared, but to my surprise the test failed when checking to ensure that comparison occurred.

EDIT: I should mention that this is on Python 2.7, but I see the same behavior on 3.3.

Upvotes: 7

Views: 813

Answers (4)

jonrsharpe

Reputation: 122091

Python's testing of equality for sequences goes as follows:

                 Lists identical?
                  /          \  
                 Y            N
                /              \
             Equal         Same length?
                            /       \  
                           Y         N
                          /           \
                  Items identical?   Not equal
                     /       \
                    Y         N
                   /           \
                Equal      Items equal?
                            /        \
                           Y          N
                          /            \
                       Equal        Not equal

You can see that the equality of the items at each position is tested only if the two sequences are the same length but the items at each position are not identical. If you want to force equality checks to be used, you need e.g.:

len(list1) == len(list2) and all(item1 == item2 for item1, item2 in zip(list1, list2))
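
As a rough sketch of how that could be wrapped up for a test, the helper below calls == on every pair, including pairs that are the same object. The names elementwise_equal and NeverEqual are made up for this illustration and are not part of any library. The length check is there because zip silently stops at the shorter sequence.

```python
def elementwise_equal(seq1, seq2):
    """Compare two sequences by calling == on every pair, even identical ones."""
    if len(seq1) != len(seq2):
        return False
    # The generator inside all() stops at the first unequal pair.
    return all(a == b for a, b in zip(seq1, seq2))


class NeverEqual(object):
    """Hypothetical spy: records that __eq__ ran and always answers False."""

    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        return False


item = NeverEqual()

# Built-in list comparison: identical items are treated as equal,
# so __eq__ is never consulted.
builtin_result = [item] == [item]
ran_after_builtin = item.eq_run

# Manual comparison: == is applied to the pair directly,
# so __eq__ runs and its False result is respected.
manual_result = elementwise_equal([item], [item])
ran_after_manual = item.eq_run
```

Here builtin_result is True with eq_run still False, while manual_result is False and eq_run has flipped to True.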

Upvotes: 10

Reut Sharabani

Reputation: 31339

CPython's underlying implementation will skip the equality check (==) for items in a list if items are identical (is).

CPython uses this as an optimization assuming identity implies equality.

This is documented for PyObject_RichCompareBool, which is used to compare the items:

Note: If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE.

From the listobject.c implementation:

/* Search for the first index where items are different */
for (i = 0; i < Py_SIZE(vl) && i < Py_SIZE(wl); i++) {
    int k = PyObject_RichCompareBool(vl->ob_item[i],
                                     wl->ob_item[i], Py_EQ);
    // k is 1 if objects are the same
    // because of RichCompareBool's behaviour
    if (k < 0)
        return NULL;
    if (!k)
        break;
}

As you can see, the loop simply moves on to the next pair as long as PyObject_RichCompareBool returns 1 (true), and for identical items it returns 1 without ever calling __eq__.

And from object.c's implementation of PyObject_RichCompareBool:

/* Quick result when objects are the same.
   Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}
// ... actually deep-compare objects

To override this you'll have to compare the items manually.
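
A quick way to see the shortcut without writing a custom class is float('nan'), which is never equal to itself under ==. This is only an illustrative sketch, and the variable names are arbitrary:

```python
nan = float('nan')

# A NaN never compares equal to anything, including itself.
direct = (nan == nan)

# Inside a list or tuple the identity shortcut fires first,
# so the very same object is treated as equal.
same_object_list = ([nan] == [nan])
same_object_tuple = ((nan,) == (nan,))

# Two distinct NaN objects are not identical, so == is actually used.
distinct_objects = ([float('nan')] == [float('nan')])
```

Only the two comparisons that reuse the same nan object come out True; direct and distinct_objects are both False.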

Upvotes: 14

William Jackson

Reputation: 1165

When comparing two lists, the CPython implementation short-circuits member comparisons using object identity (obj1 is obj2), because, according to a comment in the code:

/* Quick result when objects are the same.
   Guarantees that identity implies equality. */

If the two objects are not exactly the same object, then CPython does a rich compare, using __eq__ if implemented.

Upvotes: 3

user2864740

Reputation: 61925

If x is y there is no reason to call x == y, by contract of ==. Python is taking this shortcut.

This can be verified (or disproved) by creating an eq1 and an eq2 in the tests and then using [eq1] == [eq2].

Here is an example:

class TestableEq(object):
    def __init__(self):
        self.eq_run = False
    def __eq__(self, other):
        self.eq_run = True
        return True     # always assume equals for test

eq1 = TestableEq()
eq2 = TestableEq()
eq3 = TestableEq()

print [eq1] == [eq2]    # True
print eq1.eq_run        # True  - implies eq1 == eq2 was evaluated
print eq2.eq_run        # False - but NOT eq2 == eq1

print [eq3] == [eq3]    # True
print eq3.eq_run        # False - implies NO eq3 == eq3

When the items are identical (is), no == is involved.

The behaviour with dictionaries can be explained the same way.
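
For the dictionary case, a minimal sketch along the same lines (SpyEq is a hypothetical name, modelled on the class in the question) suggests that dict values behave like list items: a shared value is never passed to __eq__, while two distinct values are.

```python
class SpyEq(object):
    """Hypothetical spy: flags both operands whenever __eq__ runs."""

    def __init__(self):
        self.eq_run = False

    def __eq__(self, other):
        self.eq_run = True
        if isinstance(other, SpyEq):
            other.eq_run = True
        return True     # always claim equality, as in the example above


# Same object stored as the value on both sides: the identity shortcut applies.
shared = SpyEq()
same_value_result = {'k': shared} == {'k': shared}
shared_ran = shared.eq_run

# Two distinct objects as values: == has to be evaluated.
left, right = SpyEq(), SpyEq()
distinct_value_result = {'k': left} == {'k': right}
distinct_ran = left.eq_run and right.eq_run
```

So for a test that wants to confirm a container really compares its elements, injecting two distinct spy objects rather than the same one twice should be enough.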

Upvotes: 4
