Dowwie

Reputation: 2043

Overriding __eq__ and __hash__ to compare a dict attribute of two instances

I'm struggling to understand how to correctly compare objects based on an underlying dict attribute that each instance possesses.

Since I'm overriding __eq__, do I need to override __hash__ as well? I haven't a firm grasp on when/where to do so and could really use some help.

I created a simple example below to illustrate the maximum recursion exception that I've run into. A RegionalCustomerCollection organizes account IDs by geographical region. RegionalCustomerCollection objects are said to be equal if the regions and their respective accountids are. Essentially, all items() should be equal in content.

from collections import defaultdict

class RegionalCustomerCollection(object):

    def __init__(self):
        self.region_accountids = defaultdict(set) 

    def get_region_accountid(self, region_name=None):
        return self.region_accountids.get(region_name, None)

    def set_region_accountid(self, region_name, accountid):
        self.region_accountids[region_name].add(accountid)

    def __eq__(self, other):
        if (other == self):
            return True

        if isinstance(other, RegionalCustomerCollection):
            return self.region_accountids == other.region_accountids

        return False 

    def __repr__(self):
        return ', '.join(["{0}: {1}".format(region, acctids) 
                          for region, acctids 
                          in self.region_accountids.items()])

Let's create two object instances and populate them with some sample data:

>>> a = RegionalCustomerCollection()
>>> b = RegionalCustomerCollection()
>>> a.set_region_accountid('northeast',1)
>>> a.set_region_accountid('northeast',2)
>>> a.set_region_accountid('northeast',3)
>>> a.set_region_accountid('southwest',4)
>>> a.set_region_accountid('southwest',5)
>>> b.set_region_accountid('northeast',1)
>>> b.set_region_accountid('northeast',2)
>>> b.set_region_accountid('northeast',3)
>>> b.set_region_accountid('southwest',4)
>>> b.set_region_accountid('southwest',5)

Now let's try to compare the two instances and generate the recursion exception:

>>> a == b
... 
RuntimeError: maximum recursion depth exceeded while calling a Python object

Upvotes: 3

Views: 1680

Answers (2)

Steve Zelaznik

Reputation: 616

Your object shouldn't be hashable, because it's mutable. If you put this object into a dictionary or set and then change it afterward, its hash may no longer match the one it was stored under, and you may never be able to find it again.
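A minimal sketch of that failure mode, assuming a hypothetical class HashableBag (not from the question) whose hash is derived from mutable state:

```python
class HashableBag(object):
    """Hypothetical class: hashable even though its state is mutable."""

    def __init__(self, items):
        self.items = set(items)

    def __eq__(self, other):
        if isinstance(other, HashableBag):
            return self.items == other.items
        return NotImplemented

    def __hash__(self):
        # Hash derived from mutable state: the mistake being illustrated.
        return hash(frozenset(self.items))


bag = HashableBag([1, 2])
seen = {bag}
found_before = bag in seen   # stored under the hash of {1, 2}

bag.items.add(3)             # mutate after insertion, so the hash changes
found_after = bag in seen    # lookup now searches under the new hash

# The object is still physically in the set, just unreachable by lookup.
still_stored = any(item is bag for item in seen)
```

After the mutation, the membership test should fail even though iterating over the set still yields the same object.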

In order to make an object unhashable, you need to do the following:

class MyClass(object):
    __hash__ = None

This will ensure that the object is unhashable.

>>> m = MyClass()
>>> hash(m)
TypeError: unhashable type: 'MyClass'

Does this answer your question? I suspect not, because you were explicitly asking about a hash function.

As for the RuntimeError you're receiving, it's caused by the following lines in your __eq__:

    if (other == self):
        return True

That gets you into infinite recursion: other == self calls other.__eq__(self), which runs the same comparison again. Use an identity check instead:

    if self is other:
        return True
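Putting both suggestions together, a sketch of the question's class might look like the following. The __ne__ method is an addition not mentioned above; it is included on the assumption that the code runs on Python 2, which does not derive != from __eq__.

```python
from collections import defaultdict


class RegionalCustomerCollection(object):

    def __init__(self):
        self.region_accountids = defaultdict(set)

    def set_region_accountid(self, region_name, accountid):
        self.region_accountids[region_name].add(accountid)

    def __eq__(self, other):
        # Identity check: `is` never calls __eq__, so it cannot recurse.
        if self is other:
            return True
        if isinstance(other, RegionalCustomerCollection):
            return self.region_accountids == other.region_accountids
        return False

    def __ne__(self, other):
        # Assumption: needed on Python 2 only; harmless on Python 3.
        return not (self == other)

    # Instances are mutable, so mark them unhashable explicitly.
    __hash__ = None


a = RegionalCustomerCollection()
b = RegionalCustomerCollection()
for acct in (1, 2, 3):
    a.set_region_accountid('northeast', acct)
    b.set_region_accountid('northeast', acct)

same = (a == b)                          # equal contents
b.set_region_accountid('southwest', 4)
different = (a != b)                     # contents now differ
```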

Upvotes: 3

Rodrigo

Reputation: 81

You don't need to override __hash__ just to compare two objects with ==. You only need it if instances must be hashable, e.g. to be stored in sets or used as dictionary keys, in which case objects that compare equal must also return the same hash.
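One caveat depends on the Python version. On Python 2 (which the question's traceback suggests), a class that defines only __eq__ keeps the default identity-based hash. On Python 3, defining __eq__ without __hash__ sets __hash__ to None. A small sketch, run on Python 3 and using hypothetical classes EqOnly and EqAndHash:

```python
class EqOnly(object):
    """Hypothetical class that overrides __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, EqOnly):
            return self.value == other.value
        return NotImplemented


class EqAndHash(EqOnly):
    """Adds a hash that agrees with __eq__: equal values, equal hashes."""

    def __hash__(self):
        return hash(self.value)


equal = EqOnly(1) == EqOnly(1)            # comparison needs no __hash__
hash_is_none = EqOnly.__hash__ is None    # Python 3 behaviour
unique_count = len({EqAndHash(1), EqAndHash(1)})  # equal objects collapse in a set
```

The value held by EqAndHash is never mutated here, which is what makes hashing on it safe in this sketch.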

Also, you have infinite recursion here:

    def __eq__(self, other):
        if (other == self):
            return True

        if isinstance(other, RegionalCustomerCollection):
            return self.region_accountids == other.region_accountids

        return False 

If both objects are of type RegionalCustomerCollection then you'll have infinite recursion, since other == self calls other.__eq__(self), which in turn evaluates self == other, and so on.
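The recursion can be reproduced in isolation. This is a minimal sketch with a hypothetical class Recursive, stripped down to the problematic comparison:

```python
class Recursive(object):
    """Hypothetical minimal class reproducing the bug."""

    def __eq__(self, other):
        # `other == self` dispatches to other.__eq__(self), which evaluates
        # `self == other`, which dispatches back here, and so on.
        if other == self:
            return True
        return False


def compare_raises():
    try:
        Recursive() == Recursive()
    except RuntimeError:
        # Python 2 raises RuntimeError; Python 3 raises RecursionError,
        # which is a subclass of RuntimeError.
        return True
    return False


hit_limit = compare_raises()
```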

Upvotes: 1
