Ed Danileyko

Reputation: 239

Using an object's id() as a hash value

Is it a bad idea to implement __hash__ like so?

class XYZ:
    def __init__(self):
        self.val = None

    def __hash__(self):
        return id(self)

Am I setting up something potentially disastrous?

Upvotes: 6

Views: 3181

Answers (3)

Dunes

Reputation: 40703

This is essentially what the default implementation of __hash__ already does. Be aware that implementing __eq__ causes the default __hash__ implementation to disappear. If you reimplement __hash__, any objects that compare equal must have equal hashes.
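A quick sketch of that behavior under Python 3. The class names Plain and WithEq are made up for illustration:

```python
class Plain:
    """No __eq__, so the inherited identity-based __hash__ is kept."""


class WithEq:
    """Defines __eq__ only; Python 3 then sets __hash__ to None."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, WithEq):
            return NotImplemented
        return self.val == other.val


# Instances of Plain are still hashable.
plain_hashable = isinstance(hash(Plain()), int)

# Instances of WithEq are not: hash() raises TypeError.
try:
    hash(WithEq(1))
    with_eq_hashable = True
except TypeError:
    with_eq_hashable = False
```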

It is okay for non-equal objects to have the same hash value though. Therefore, having a hash implementation that returns a constant value is always safe. However, it is very inefficient.

A good default that works for a lot of use cases is to return the hash of a tuple of the attributes that are used in the __eq__ method. For example:

class XYZ:
    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

    def __eq__(self, other):
        return self.val0 == other.val0 and self.val1 == other.val1

    def __hash__(self):
        return hash((self.val0, self.val1))
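A short usage sketch of a class like this, assuming __eq__ compares val0 with val0 and val1 with val1. The isinstance guard and the variable names a, b, c, unique and lookup are additions for illustration only:

```python
class XYZ:
    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

    def __eq__(self, other):
        # The isinstance guard is an addition for this sketch.
        if not isinstance(other, XYZ):
            return NotImplemented
        return self.val0 == other.val0 and self.val1 == other.val1

    def __hash__(self):
        # Hash exactly the attributes that __eq__ compares.
        return hash((self.val0, self.val1))


a = XYZ(1, "x")
b = XYZ(1, "x")  # equal to a, but a distinct object
c = XYZ(2, "x")

unique = {a, b, c}        # a and b collapse into a single entry
lookup = {a: "found"}[b]  # b finds the value stored under a
```

Because equal objects hash equally, sets and dicts treat a and b as the same key.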

Upvotes: 1

user2357112

Reputation: 280311

It's either pointless or wrong, depending on the rest of the class.

If your objects use the default identity-based ==, then defining this __hash__ is pointless. The default __hash__ is also identity-based, but faster, and tweaked to avoid always having the low bits set to 0. Using the default __hash__ would be simpler and more efficient.

If your objects don't use the default identity-based ==, then your __hash__ is wrong, because it's going to be inconsistent with ==. If your objects are immutable, you should implement __hash__ in a way that is consistent with ==; if your objects are mutable, you should not implement __hash__ at all (and set __hash__ = None if you need to support Python 2).
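To illustrate the "wrong" case, here is a minimal sketch of a value-based __eq__ combined with an id-based __hash__. The class name Broken is hypothetical:

```python
class Broken:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Broken):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        # Identity-based hash, inconsistent with the value-based __eq__.
        return id(self)


a = Broken(1)
b = Broken(1)

equal = a == b                  # the objects compare equal...
same_hash = hash(a) == hash(b)  # ...but their hashes differ
found = b in {a}                # so the set lookup misses
```

Two objects that compare equal end up as separate set members and separate dict keys, which is rarely what a value-based == is meant to express.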

Upvotes: 3

Bakuriu

Reputation: 101919

The __hash__ method has to satisfy the following requirement in order to work:

For all x, y such that x == y, hash(x) == hash(y).

In your case, your class does not implement __eq__, which means that x == y if and only if id(x) == id(y), and thus your hash implementation satisfies the above property.

Note however that if you do implement __eq__ then this implementation will likely fail.

Also: there is a difference between having a "valid" __hash__ and having a good hash. For example the following is a valid __hash__ definition for any class:

def __hash__(self):
    return 1
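As a rough sketch of why this counts as valid, here is a constant hash on a hypothetical class named Constant with a value-based __eq__. Lookups still give correct answers, although every object collides, so each operation has to fall back on __eq__ comparisons:

```python
class Constant:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Constant):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        # Every instance collides, so the set degrades to linear probing.
        return 1


items = {Constant(i) for i in range(100)}

present = Constant(5) in items    # an equal object is found
absent = Constant(500) in items   # an unequal object is not
```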

A good hash should distribute objects uniformly so as to avoid collisions as much as possible. Usually this requires a more complex definition. I'd avoid trying to come up with formulas and instead rely on Python's built-in hash function.

For example if your class has fields a, b and c then I'd use something like this as __hash__:

def __hash__(self):
    return hash((self.a, self.b, self.c))

The definition of hash for tuples should be good enough for the average case.

Finally: you should not define __hash__ in classes that are mutable (in the fields used for equality). That's because modifying an instance changes its hash, and any set or dict that already contains it will still have it stored under the old hash.
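A small sketch of that failure mode, using a hypothetical mutable class named Point whose hash depends on fields that are later modified:

```python
class Point:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.a, self.b) == (other.a, other.b)

    def __hash__(self):
        return hash((self.a, self.b))


p = Point(1, 2)
s = {p}  # p is stored under the hash of (1, 2)

hash_before = hash(p)
p.a = 10  # mutate a field that takes part in the hash
hash_after = hash(p)

# The set still holds p, but the lookup now uses the new hash.
still_found = p in s
still_inside = any(item is p for item in s)
```

The object is still physically in the set, yet membership tests no longer find it.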

Upvotes: 6
