Reputation: 16870
One of the classes I defined is used in a set()
to filter out equal objects. But it doesn't work as I expect it, so I obviously understand something wrong.
class Foo(object):
def __hash__(self):
return 7
x = set()
x.add(Foo())
assert len(x) == 1
x.add(Foo())
assert len(x) == 1 # AssertionError
I expect the set to consist of only one element, but it has two. Why is that?
Upvotes: 3
Views: 741
Reputation: 133574
Hash collisions are know to occur in sets (hash maps), no hash algorithm is good enough to have a unique hash for every item, or else it would take a long time to calculate. When a collision does occur, python falls back to checking the equality of the values with __eq__
to make sure they are not the same.
class Foo(object):
def __hash__(self):
return 7
def __eq__(self, other):
return True
>>> x = set()
>>> x.add(Foo())
>>> assert len(x) == 1
>>> x.add(Foo())
>>> assert len(x) == 1
>>>
This is a reason why you see frightening runtimes over here but note that you can expect O(1) amortized membership checks in sets, even though they have O(N) worst case (everything is a hash collision). The worst case is very very very unlikely to occur due to Python's smart implementation.
Upvotes: 7