Reputation: 239
Is it a bad idea to implement __hash__ like so?
class XYZ:
    def __init__(self):
        self.val = None

    def __hash__(self):
        return id(self)
Am I setting up something potentially disastrous?
Upvotes: 6
Views: 3181
Reputation: 40703
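This is essentially the default implementation of __hash__, which is also based on the object's identity. Be aware that implementing __eq__ without also defining __hash__ causes the default __hash__ implementation to disappear (in Python 3, __hash__ is set to None). If you do reimplement __hash__, then any objects that compare equal must have equal hashes.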
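As a minimal sketch of the disappearing default on Python 3 (the class name WithEq is hypothetical and only for illustration):

```python
class WithEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, WithEq):
            return NotImplemented
        return self.val == other.val


# Python 3 sets __hash__ to None on a class that overrides only __eq__.
hash_is_removed = WithEq.__hash__ is None

try:
    hash(WithEq(1))
    is_hashable = True
except TypeError:
    # Instances can no longer be used as dict keys or set members.
    is_hashable = False

print(hash_is_removed, is_hashable)
```

Defining __hash__ explicitly in the same class body restores hashability.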
It is okay for non-equal objects to have the same hash value though. Therefore, having a hash implementation that returns a constant value is always safe. However, it is very inefficient.
A good default that works for a lot of use cases is to return a hash of the tuple of the attributes that are used in the __eq__ method, e.g.:
class XYZ:
    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

    def __eq__(self, other):
        # Compare exactly the attributes that __hash__ uses.
        return self.val0 == other.val0 and self.val1 == other.val1

    def __hash__(self):
        return hash((self.val0, self.val1))
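A short sketch of what this consistency buys you in practice. The class name Pair and the isinstance check are assumptions added for illustration, not part of the answer above:

```python
class Pair:
    """Hypothetical two-field class hashed on the fields used by __eq__."""

    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

    def __eq__(self, other):
        if not isinstance(other, Pair):
            return NotImplemented
        return self.val0 == other.val0 and self.val1 == other.val1

    def __hash__(self):
        return hash((self.val0, self.val1))


a = Pair(1, "x")
b = Pair(1, "x")  # equal to a, but a different object
c = Pair(2, "x")

same_hash = hash(a) == hash(b)   # equal objects share a hash
found_in_set = b in {a}          # so set membership works by value
lookup = {a: "first"}[b]         # and so do dict lookups
distinct = a != c
```

Because equal instances hash alike, either one can stand in for the other as a dict key or set member.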
Upvotes: 1
Reputation: 280311
It's either pointless or wrong, depending on the rest of the class.
If your objects use the default identity-based ==, then defining this __hash__ is pointless. The default __hash__ is also identity-based, but faster, and tweaked to avoid always having the low bits set to 0. Using the default __hash__ would be simpler and more efficient.
If your objects don't use the default identity-based ==, then your __hash__ is wrong, because it's going to be inconsistent with ==. If your objects are immutable, you should implement __hash__ in a way that is consistent with ==; if your objects are mutable, you should not implement __hash__ at all (and set __hash__ = None if you need to support Python 2).
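To illustrate the "wrong" case, here is a minimal sketch that pairs a value-based == with the id-based __hash__ from the question. The class name Broken is hypothetical:

```python
class Broken:
    """Hypothetical class: value-based __eq__, identity-based __hash__."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Broken):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        return id(self)  # ignores val, so it disagrees with __eq__


a = Broken(1)
b = Broken(1)

equal = a == b                  # the objects compare equal...
same_hash = hash(a) == hash(b)  # ...but two live objects never share an id
found = b in {a}                # so the set looks in the wrong place

print(equal, same_hash, found)
```

The set never even calls __eq__ here, because the hashes already disagree.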
Upvotes: 3
Reputation: 101919
The __hash__ method has to satisfy the following requirement in order to work:
For all x and y such that x == y, hash(x) == hash(y).
In your case, your class does not implement __eq__, which means that x == y if and only if id(x) == id(y), and thus your hash implementation satisfies the above property.
Note, however, that if you do implement __eq__, then this implementation will likely fail.
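A quick sketch that checks the property for the class from the question, relying only on the default identity-based ==:

```python
class XYZ:
    def __init__(self):
        self.val = None

    def __hash__(self):
        return id(self)


x = XYZ()
y = XYZ()
alias = x  # a second name for the same object

# x == alias holds only because they are the same object, so the ids
# (and therefore the hashes) are necessarily equal too.
equal_implies_same_hash = (x == alias) and hash(x) == hash(alias)

# Distinct objects are never equal, so the requirement says nothing about them.
distinct_unequal = x != y

members = {x, y, alias}  # alias collapses into x
```

Here the requirement holds trivially, which is also why the explicit __hash__ adds nothing over the default.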
Also: there is a difference between having a "valid" __hash__ and having a good hash. For example, the following is a valid __hash__ definition for any class:
def __hash__(self):
    return 1
A good hash should distribute objects as uniformly as possible, so as to avoid collisions. Usually this requires a more complex definition.
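One rough way to see the cost of collisions is to count how often __eq__ runs while filling a set. This is only a sketch: the class names and the comparisons_for helper are hypothetical, and the exact counts depend on the interpreter.

```python
class Counted:
    """Hypothetical base class that counts calls to __eq__."""

    eq_calls = 0

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        type(self).eq_calls += 1
        return self.val == other.val


class ConstantHash(Counted):
    def __hash__(self):
        return 1  # valid, but every object collides


class TupleHash(Counted):
    def __hash__(self):
        return hash((self.val,))  # spreads objects across buckets


def comparisons_for(cls, n=200):
    """Build a set of n distinct objects and report how many __eq__ calls it took."""
    cls.eq_calls = 0
    items = {cls(i) for i in range(n)}
    assert len(items) == n
    return cls.eq_calls


constant_calls = comparisons_for(ConstantHash)
tuple_calls = comparisons_for(TupleHash)
print(constant_calls, tuple_calls)
```

With the constant hash, every insertion has to be compared against the objects already stored, so the work grows roughly quadratically with the number of objects.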
I'd avoid trying to come up with formulas and instead rely on Python's built-in hash function.
For example, if your class has fields a, b and c, then I'd use something like this as __hash__:
def __hash__(self):
    return hash((self.a, self.b, self.c))
The definition of hash for tuples should be good enough for the average case.
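One way to keep __eq__ and __hash__ in sync is to derive both from the same tuple. The sketch below assumes a hypothetical class Record and a hypothetical helper _key; neither comes from the answer itself:

```python
class Record:
    """Hypothetical class with fields a, b and c."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def _key(self):
        # Single source of truth for both equality and hashing.
        return (self.a, self.b, self.c)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


r1 = Record(1, "two", 3.0)
r2 = Record(1, "two", 3.0)

hashes_match = hash(r1) == hash(r2)
matches_tuple_hash = hash(r1) == hash((1, "two", 3.0))
unique = {r1, r2}  # equal records collapse to one entry
```

Adding a field then means touching only _key, which makes it harder for the two methods to drift apart.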
Finally: you should not define __hash__ in classes that are mutable (in the fields used for equality). That's because modifying an instance changes its hash, which breaks any dict or set that already contains it.
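A minimal sketch of how this breaks, using a hypothetical mutable Point class. The behaviour shown is what CPython's set does; other implementations may differ in detail:

```python
class Point:
    """Hypothetical mutable class that (unwisely) defines __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2)
points = {p}   # stored under hash((1, 2))
p.x = 10       # mutate a field used by __eq__ and __hash__

# The old value hashes to the right slot but no longer compares equal.
found_by_old_value = Point(1, 2) in points
# The new value compares equal but is looked up under a different hash.
found_by_new_value = Point(10, 2) in points
# The object is still physically in the set, just unreachable by lookup.
still_stored = any(q is p for q in points)

print(found_by_old_value, found_by_new_value, still_stored)
```

The set still holds the object, but neither its old nor its new value can find it.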
Upvotes: 6