Reputation: 45
Consider a custom class:
class MyObject:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __hash__(self):
        return hash((self.a, self.b))

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.__hash__() == other.__hash__()
Is it a bad idea to make equality rely on the hash? It seems much more elegant and readable than checking each pair of attributes piecemeal, which gets unwieldy as the number of attributes grows, à la
self.a == other.a and self.b == other.b and ... self.n == other.n
or a more dynamic check using getattr and a list (is there a better way to compare large numbers of pairs of attributes?)
Is the size of the hash returned by the builtin hash function not large enough to be reliable in relatively large sets of data?
Upvotes: 2
Views: 252
Reputation: 1122242
Yes, this is a bad idea. Hashes are not unique; objects with equal hashes are not guaranteed to be equal:
>>> (-1, 0) == (-2, 0)
False
>>> hash((-1, 0)) == hash((-2, 0))
True
Hashes are not meant to be unique; they are a means to pick a slot in a limited-size hash table quickly, to facilitate O(1) dictionary look-ups, and collisions are allowed and expected.
Yes, Python requires that equal objects should have equal hashes, but that doesn't mean the relationship can be reversed.
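To make the failure concrete, here is a small sketch that applies the question's hash-based `__eq__` to the colliding pair above. The class name `HashEqObject` is made up for illustration, and the collision relies on CPython, where `hash(-1)` and `hash(-2)` are both `-2`.

```python
class HashEqObject:
    """Hypothetical copy of the question's class: equality is based on the hash."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __hash__(self):
        return hash((self.a, self.b))

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return hash(self) == hash(other)
        return NotImplemented


x = HashEqObject(-1, 0)
y = HashEqObject(-2, 0)

# On CPython the two tuples hash alike, so the objects compare equal
# even though x.a != y.a.
false_positive = (x == y)

# A set relies on __eq__ to tell colliding entries apart, so it keeps only one.
distinct_count = len({x, y})
```

With this definition a set or dict silently drops one of the two objects, which is the kind of bug that is hard to trace back to `__eq__`.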
I just compare tuples:
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)
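For classes with many attributes, one possible way to keep this readable is to build the tuple in a single place and reuse it for both methods. The sketch below assumes a helper named `_key` (a made-up name, not a Python convention) and adds an `isinstance` check that returns `NotImplemented` for foreign types:

```python
class MyObject:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def _key(self):
        # Hypothetical helper: the single place that lists the attributes
        # taking part in both equality and hashing.
        return (self.a, self.b)

    def __eq__(self, other):
        if not isinstance(other, MyObject):
            # Let Python try the reflected comparison or fall back to identity.
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())
```

Adding an attribute then means touching only `_key`, and equal objects still hash alike because both methods use the same tuple.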
If you are writing a lot of data classes (simple classes that all need equality testing, hashing, and so on), use the dataclasses module (Python 3.7 or up, or use a backport):
from dataclasses import dataclass

@dataclass(frozen=True)
class MyObject:
    a: int
    b: int
The above class now comes with generated __hash__ and __eq__ methods:
>>> MyObject(-1, 0) == MyObject(-2, 0)
False
>>> hash(MyObject(-1, 0)) == hash(MyObject(-2, 0))
True
>>> MyObject(42, 12345) == MyObject(42, 12345)
True
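As a rough illustration of why the collision is harmless here, the following sketch uses two such instances as dictionary keys. `Pair` is a hypothetical stand-in for `MyObject` with the same two fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    # Hypothetical stand-in for MyObject, with the same two fields.
    a: int
    b: int


p, q = Pair(-1, 0), Pair(-2, 0)

# Both keys survive even if their hashes collide, because the dict
# falls back to the generated field-by-field __eq__.
labels = {p: "first", q: "second"}
```

Looking up `labels[Pair(-2, 0)]` finds the second entry, since a freshly built instance with equal fields compares equal to the stored key.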
Upvotes: 3