Reputation: 843
I'm trying to implement a set-based solution for a problem and have been running into some issues.
The problem is: I have two sets of Group objects. These sets should be unique on email (so we can check if an object from one set is in the other set).
However, two Group objects are not __eq__() if they only have an email match (for example, one set may contain an updated Group object that has a new description). The goal is to have a set where I can perform set operations (intersection and difference) based only on the email field, then check equality based on the other fields (description and name).
    class Group:
        def __init__(self, name, email, description):
            self.name = name
            self.email = email
            self.description = description

        def __hash__(self):
            # Hash on email only
            return hash(self.email)

        def __eq__(self, other):
            # Parentheses let the condition span several lines
            return (self.email == other.email
                    and self.description == other.description
                    and self.name == other.name)

        def __ne__(self, other):
            return not self.__eq__(other)

        def __str__(self):
            return "Description: {0} Email: {1} Name: {2}".format(self.description, self.email, self.name)
So I'd expect all assert statements to pass here:
    group_1 = Group('first test group', '[email protected]', 'example description')
    group_2 = Group('second test group', '[email protected]', 'example description')
    group_3 = Group('third group', '[email protected]', 'example description')
    group_5 = Group('updated name', '[email protected]', 'example description')
    group_set = set([group_1, group_2, group_3])
    group_set_2 = set([group_3, group_5])
    self.assertTrue(group_5 in group_set.intersection(group_set_2))
    self.assertEqual(2, len(group_set))
    self.assertTrue(group_5 in group_set)
Upvotes: 1
Views: 2748
Reputation: 104722
Python's set type uses the equality test implemented by an object's __eq__ method to determine if an object is "the same" as another object in its contents. The __hash__ method only allows it to find other elements to compare against more efficiently. So, your hope of using a __hash__ method based on a different set of attributes than the __eq__ method will not work. Multiple unequal objects with the same __hash__ value can exist in the same set (though the set will be somewhat less efficient due to the hash collisions).
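Here is a small sketch of that behaviour. The Item class and its key and value attributes are made up for illustration (they are not from the question); the hash looks only at key, while equality also looks at value:

```python
class Item:
    """Hypothetical class: hashes on key only, compares on key and value."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        if not isinstance(other, Item):
            return NotImplemented
        return (self.key, self.value) == (other.key, other.value)


a = Item("k", 1)
b = Item("k", 2)  # same hash as a, but not equal to a

items = {a, b}
print(len(items))             # both objects are kept, since they are unequal
print(Item("k", 1) in items)  # same hash and equal to a
print(Item("k", 3) in items)  # same hash, but equal to neither a nor b
```

The set keeps both a and b, and a membership test succeeds only for an object that compares equal to one already stored, which is why a set cannot be made unique on email alone while __eq__ also compares the other fields.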
If you want a unique mapping from an email address to a Group, I suggest using a dictionary where the keys are email addresses and the values are Group objects. This will let you ensure the email addresses are unique, while also letting you compare Group objects in whatever way is most appropriate.
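One way to build such a dictionary is sketched below. The namedtuple is only a stand-in for the question's Group class, and the helper name index_by_email and the example.com addresses are placeholders I made up:

```python
from collections import namedtuple

# Stand-in for the question's Group class; a namedtuple compares all fields.
Group = namedtuple("Group", ["name", "email", "description"])


def index_by_email(groups):
    """Build {email: group}; a later group with the same email replaces an earlier one."""
    return {group.email: group for group in groups}


groups = [
    Group("first", "a@example.com", "example description"),
    Group("second", "b@example.com", "example description"),
    Group("second, renamed", "b@example.com", "example description"),
]

by_email = index_by_email(groups)
print(sorted(by_email))                # one key per distinct email
print(by_email["b@example.com"].name)  # the last group seen for that email
```

Because the keys are the emails, uniqueness on email comes from the dictionary itself, and Group equality stays free to compare every field.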
To perform a union between two such dictionaries, use the update method on a copy of one dictionary:
    union = dict_1.copy()
    union.update(dict_2)
For an intersection, use a dictionary comprehension:
    intersection = {email: group for email, group in dict_2.items() if email in dict_1}
Both of those operations will prefer the values from dict_2 over the values from dict_1 wherever the same email occurs as a key in both. If you want it to work the other way, just switch the dictionary names around.
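The question also asks for a difference, and for a way to spot groups whose other fields changed. A rough sketch along the same lines follows; the namedtuple stand-in for Group, the sample data, and the names only_in_1 and changed are my own assumptions:

```python
from collections import namedtuple

# Stand-in for the question's Group class; a namedtuple compares all fields.
Group = namedtuple("Group", ["name", "email", "description"])

dict_1 = {
    "a@example.com": Group("first", "a@example.com", "example description"),
    "b@example.com": Group("second", "b@example.com", "example description"),
}
dict_2 = {
    "b@example.com": Group("second", "b@example.com", "new description"),
    "c@example.com": Group("third", "c@example.com", "example description"),
}

# Difference: emails that appear in dict_1 but not in dict_2.
only_in_1 = {email: dict_1[email] for email in set(dict_1) - set(dict_2)}

# Emails present in both whose Group objects differ in some other field.
changed = {
    email: (dict_1[email], dict_2[email])
    for email in set(dict_1) & set(dict_2)
    if dict_1[email] != dict_2[email]
}

print(sorted(only_in_1))  # ['a@example.com']
print(sorted(changed))    # ['b@example.com']
```

Converting the keys with set() keeps the set arithmetic on the emails only, and the full-field comparison happens afterwards on the values.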
Upvotes: 1