Vineet Bansal
Vineet Bansal

Reputation: 511

What's the order of __hash__ and __eq__ evaluation for a Python dict?

I'm trying to understand what Python dictionaries must be doing internally to locate a key. It seems to me that hash would be evaluated first, and if there's a collision, Python would iterate through the keys till it finds one for which eq returns True. Which makes me wonder why the following code works (test code only for understanding the internals):

class MyClass(object):
    def __eq__(self, other):
        return False

    def __hash__(self):
        return 42

if __name__=='__main__':

    o1 = MyClass()
    o2 = MyClass()
    d = {o1: 'o1', o2: 'o2'}
    assert(o1 in d)      # 1
    assert(d[o1]=='o1')  # 2
    assert(o2 in d)      # 3
    assert(d[o2]=='o2')  # 4

Shouldn't the dictionary be unable to find the correct key (returning either 'o1' or 'o2' in both cases #2 and #4, or throwing an error, depending on internal implementation). How is it able to land on the correct key in both cases, when it should never be able to correctly 'equate' the keys (since eq returns False).

All the documentation I've seen on hashing always mentions hash and eq together, never cmp, ne etc, which makes me think these 2 are the only ones that play a role in this scenario.

Upvotes: 5

Views: 802

Answers (1)

user2357112
user2357112

Reputation: 280973

Anything you use as a dict key has to satisfy the invariant that bool(x == x) is True. (I would have just said x == x, but there are reasonable objects for which that isn't even a boolean.)

The dict assumes that this will hold, so the routine it uses to check key equality actually checks object identity first before using ==. This preliminary check is an implementation detail; you should not rely on it happening or not happening.

Reasonable objects for which (x == x) is not True include float('nan') and numpy.array([1, 2]).

Upvotes: 5

Related Questions