localusername

Reputation: 45

Class Custom __eq__ as Comparison of Hashes

Consider a custom class:

class MyObject:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __hash__(self):
        return hash((self.a, self.b))

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.__hash__() == other.__hash__()

Is it a bad idea to make equality rely on the hash? This seems much more elegant and readable than checking each pair of attributes piecemeal when there are many attributes, à la

self.a == other.a and self.b == other.b and ... self.n == other.n

or a more dynamic check using getattr and a list (is there a better way to compare large numbers of pairs of attributes?)

Is the size of the hash returned by the builtin hash function not large enough to be reliable in relatively large sets of data?

Upvotes: 2

Views: 252

Answers (1)

Martijn Pieters

Reputation: 1122242

Yes, this is a bad idea. Hashes are not unique; objects with equal hashes are not guaranteed to be equal:

>>> (-1, 0) == (-2, 0)
False
>>> hash((-1, 0)) == hash((-2, 0))
True

Hashes are not meant to be unique; they are a means to pick a slot in a limited-size hash table quickly, to facilitate O(1) dictionary look-ups, and collisions are allowed and expected.
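As a small sketch of "collisions are allowed and expected": the two tuples from above can both be used as dictionary keys, even though their hashes collide. The collision itself relies on a CPython detail (hash(-1) and hash(-2) are the same value); the variable names are only for illustration.

```python
# Two distinct tuple keys whose hashes collide in CPython,
# because hash(-1) and hash(-2) are the same value there.
k1, k2 = (-1, 0), (-2, 0)
same_hash = hash(k1) == hash(k2)

# The dict uses == to tell colliding keys apart, so both entries survive.
d = {k1: "first", k2: "second"}
print(same_hash, len(d), d[k1], d[k2])
```

The hash only narrows down where to look; equality makes the final decision.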

Yes, Python requires that equal objects should have equal hashes, but that doesn't mean the relationship can be reversed.
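To make the failure concrete, here is a sketch of what happens when you do reverse the relationship. HashEq is a hypothetical name for the class from the question (with a NotImplemented fallback added); the collision again assumes CPython, where hash(-1) == hash(-2).

```python
class HashEq:
    """The question's approach: equality delegated to the hash (not recommended)."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __hash__(self):
        return hash((self.a, self.b))

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return hash(self) == hash(other)
        return NotImplemented


x, y = HashEq(-1, 0), HashEq(-2, 0)

# Different attribute values, yet reported as equal because the hashes collide.
wrongly_equal = x == y

# The set trusts __eq__, so it silently drops one of the two objects.
merged = {x, y}
print(wrongly_equal, len(merged))
```

A set or dict can no longer separate colliding objects here, because the equality check it falls back on is the hash comparison itself.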

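Instead, just compare tuples:

def __eq__(self, other):
    if not isinstance(other, MyObject):
        return NotImplemented  # let Python try the other operand
    return (self.a, self.b) == (other.a, other.b)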
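For a larger number of attributes, one possible pattern is to list them once in a helper and build both __eq__ and __hash__ on top of it. This is only a sketch: Record and _key are hypothetical names, not something from the question.

```python
class Record:
    """Hypothetical class with several attributes."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def _key(self):
        # Hypothetical helper: the single place that lists the attributes
        # taking part in equality and hashing.
        return (self.a, self.b, self.c)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


print(Record(1, 2, 3) == Record(1, 2, 3))
print(Record(1, 2, 3) == Record(1, 2, 4))
```

Because both methods derive from the same tuple, equal objects always hash equal, while equality itself still compares the real attribute values.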

If you are writing a lot of data classes (simple classes that all need equality testing, hashing, and so on), use the dataclasses module (Python 3.7 or up, or use a backport):

from dataclasses import dataclass

@dataclass(frozen=True)
class MyObject:
    a: int
    b: int

The above class now comes with generated __hash__ and __eq__ methods:

>>> MyObject(-1, 0) == MyObject(-2, 0)
False
>>> hash(MyObject(-1, 0)) == hash(MyObject(-2, 0))
True
>>> MyObject(42, 12345) == MyObject(42, 12345)
True
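A short sketch of what that buys you in practice: the colliding pair stays distinct inside a set, and frozen=True blocks attribute assignment, which keeps the hash stable. The variable names pair and mutated are only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class MyObject:
    a: int
    b: int


# Same hash (in CPython), but the generated __eq__ compares the fields,
# so the set keeps both objects.
pair = {MyObject(-1, 0), MyObject(-2, 0)}

# frozen=True makes attribute assignment raise instead of mutating.
try:
    MyObject(1, 2).a = 99
    mutated = True
except FrozenInstanceError:
    mutated = False

print(len(pair), mutated)
```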

Upvotes: 3
