Christopher King
Christopher King

Reputation: 10961

Objects have same hash, dictionary not recognizing as same

I have two objects which represent the same one. I even insured they had the same hash. I still got an error though from a dictionary:

>>> hash(one)
1098414562
>>> hash(one+zero)
1098414562
>>> a={one:1}
>>> a[one+zero]

Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    a[one+zero]
KeyError: {{|}|}

What else do I have to do to ensure the dictionary reconizes it as the same key?

Upvotes: 3

Views: 771

Answers (3)

John La Rooy
John La Rooy

Reputation: 304443

This is called a hash-collision. The hash is just a relatively efficient way to distribute the keys across the dictionary. It doesn't guarantee uniqueness. When you look up a key, all the keys with a matching hash need to be considered, and will use the __eq__ method to determine if they really match or not.

Aside: It's possible to have a class that always returns the same hash for any of it's instances. However this destroys the usual O(1) lookup complexity

You can easily see the linear time lookups by experimenting with different values of n here

class MyInt(int):
    def __hash__(self):
        return hash(1)

d = {MyInt(i): i for i in xrange(n)}

d[MyInt(1)]  # This will take O(n) time since all values have the same hash

Upvotes: 0

Pavel Anossov
Pavel Anossov

Reputation: 62948

To be properly hashable dict keys, the objects must also define __eq__() or __cmp__(). They must compare equal to be recognized as the same key.

If the objects have the same hash, but do not compare equal, a hash collision is assumed, and they go separately in the same hash bucket.

When an object is looked up by hash, all objects in the matching hash bucket are compared to it, if none are equal, it's a KeyError.

Upvotes: 3

Ignacio Vazquez-Abrams
Ignacio Vazquez-Abrams

Reputation: 799310

If a class does not define a __cmp__() or __eq__() method it should not define a __hash__() operation either; if it defines __cmp__() or __eq__() but not __hash__(), its instances will not be usable in hashed collections. If a class defines mutable objects and implements a __cmp__() or __eq__() method, it should not implement __hash__(), since hashable collection implementations require that a object’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).

source

Upvotes: 2

Related Questions