Reputation: 23
I wrote this simple code and I was trying to understand what is going on exactly. I created to equal objects and put only one of them in a dictionary.
Then, using the second object as a key, I try to print the name attribute of its value.
Thanks to my hash function, the dictionary returns the value of the hash corresponding to the key I inserted, which is the same for obj1 and obj2.
Here is my question: does my hash function check that the two objects are indeed equal or that is it a case of collision?
I hope the question is clear.
class Test:
def __init__(self, name):
self.name = name
def __eq__(self, other):
return (isinstance(other, type(self)) and self.name == other.name)
def __hash__(self):
return hash(self.name)
obj1 = Test('abc')
obj2 = Test('abc')
d = {}
d[obj1] = obj1
print(d[obj2].name)
Upvotes: 2
Views: 948
Reputation: 387745
You can easily figure this out by testing a few combinations. Consider these two types:
class AlwaysEqualConstantHash:
def __eq__(self, other):
print('AlwaysEqualConstantHash eq')
return True
def __hash__(self):
print('AlwaysEqualConstantHash hash')
return 4
class NeverEqualConstantHash:
def __eq__(self, other):
print('NeverEqualConstantHash eq')
return False
def __hash__(self):
print('NeverEqualConstantHash hash')
return 4
Now let’s put this inside a dictionary and see what happens:
>>> d = {}
>>> d[AlwaysEqualConstantHash()] = 'a'
AlwaysEqualConstantHash hash
>>> d[AlwaysEqualConstantHash()]
AlwaysEqualConstantHash hash
AlwaysEqualConstantHash eq
'a'
>>> d[AlwaysEqualConstantHash()] = 'b'
AlwaysEqualConstantHash hash
AlwaysEqualConstantHash eq
>>> d
{<__main__.AlwaysEqualConstantHash object at 0x00000083E8174A90>: 'b'}
As you can see, the hash is used all the time to address the element in the dictionary. And as soon as there is an element with the same hash inside the dictionary, the equality comparison is also made to figure whether the existing element is equal to the new one. So since all our new AlwaysEqualConstantHash
objects are equal to another, they all can be used as the same key in the dictionary.
>>> d = {}
>>> d[NeverEqualConstantHash()] = 'a'
NeverEqualConstantHash hash
>>> d[NeverEqualConstantHash()]
NeverEqualConstantHash hash
NeverEqualConstantHash eq
Traceback (most recent call last):
File "<pyshell#56>", line 1, in <module>
d[NeverEqualConstantHash()]
KeyError: <__main__.NeverEqualConstantHash object at 0x00000083E8186BA8>
>>> d[NeverEqualConstantHash()] = 'b'
NeverEqualConstantHash hash
NeverEqualConstantHash eq
>>> d
{<__main__.NeverEqualConstantHash object at 0x00000083E8186F60>: 'a', <__main__.NeverEqualConstantHash object at 0x00000083E8186FD0>: 'b'}
For the NeverEqualConstantHash
this is very different. The hash is also used all the time but since a new object is never equal to another, we cannot retrieve the existing objects that way.
>>> x = NeverEqualConstantHash()
>>> d[x] = 'foo'
NeverEqualConstantHash hash
NeverEqualConstantHash eq
NeverEqualConstantHash eq
>>> d[x]
NeverEqualConstantHash hash
NeverEqualConstantHash eq
NeverEqualConstantHash eq
'foo'
If we use the exact same key though, we can still retrieve the element since it won’t need to compare to itself using __eq__
. We also see how the __eq__
is being called for every existing element with the same hash in order to check whether this new object is equal or not to another.
So yeah, the hash is being used to quickly sort the element into the dictionary. And the hash must be equal for elements that are considered equal. Only for hash collisions with existing elements the __eq__
is being used to make sure that both objects refer to the same element.
Upvotes: 2