Jared Mackey

Reputation: 4158

Is there a comparison key for set objects?

Is there a way to give set() a comparator so that, when adding an item, it compares an attribute of that item rather than checking whether the item itself is the same? For example, I want to store objects in a set where several objects may share the same value for one attribute, and have those treated as duplicates.

class TestObj(object):
    def __init__(self, value, *args, **kwargs):
        self.value = value 
        super().__init__(*args, **kwargs)

values = set()
a = TestObj('a')
b = TestObj('b')
a2 = TestObj('a')
values.add(a) # Ok
values.add(b) # Ok
values.add(a2) # Not ok but still gets added

# Hypothetical code
values = set(lambda x, y: x.value != y.value)
values.add(a) # Ok
values.add(b) # Ok
values.add(a2) # Not added

I have implemented something of my own that does roughly this, but I wanted to know whether there is a built-in way.

from queue import Queue  # Python 3; on Python 2 the module is named Queue
class UniqueByAttrQueue(Queue):
    def __init__(self, attr, *args, **kwargs):
        Queue.__init__(self, *args, **kwargs)
        self.attr = attr

    def _init(self, maxsize):
        self.queue = set()

    def _put(self, item):
        # Potential race condition, worst case message gets put in twice
        if hasattr(item, self.attr) and item not in self:
            self.queue.add(item)

    def __contains__(self, item):
        item_attr = getattr(item, self.attr)
        for x in self.queue:
            x_attr = getattr(x, self.attr)
            if x_attr == item_attr:
                return True
        return False

    def _get(self):
        return self.queue.pop()

Upvotes: 3

Views: 885

Answers (1)

ShadowRanger

Reputation: 155448

Just define __hash__ and __eq__ on the object in terms of the attribute in question and it will work with sets. For example:

class TestObj(object):
    def __init__(self, value, *args, **kwargs):
        self.value = value 
        super().__init__(*args, **kwargs)

    def __eq__(self, other):
        if not isinstance(other, TestObj):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)
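
As a quick check of the behaviour described here, the sketch below repeats the class (minus the *args/**kwargs pass-through, dropped only for brevity) and adds the three objects from the question to a plain set. The names kept and a_is_kept are just for illustration.

```python
class TestObj(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Only compare against other TestObj instances
        if not isinstance(other, TestObj):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes
        return hash(self.value)


a, b, a2 = TestObj('a'), TestObj('b'), TestObj('a')

values = set()
values.add(a)
values.add(b)
values.add(a2)  # compares equal to a, so the set is left unchanged

kept = sorted(obj.value for obj in values)
a_is_kept = any(obj is a for obj in values)  # the first object added stays
```

Note that this changes equality for TestObj everywhere, not only inside sets, so a == a2 is now True as well.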

If you can't change the object (or don't want to, say, because other things are important to equality), then use a dict instead. You can either do:

mydict[obj.value] = obj

so new objects replace old, or

mydict.setdefault(obj.value, obj)

so old objects are kept if the value in question is already in the keys. Just make sure to iterate using .viewvalues() (Python 2) or .values() (Python 3) instead of iterating directly (which would get the keys, not the values). You could actually use this approach to make a custom set-like object with a key as you describe (though you'd need to implement many more methods than I show to make it efficient; the default methods are usually fairly slow):

from collections.abc import MutableSet  # On Py2, collections without .abc

class keyedset(MutableSet):
    def __init__(self, it=(), key=lambda x: x):
        self.key = key
        self.contents = {}
        for x in it:
            self.add(x)

    def __contains__(self, x):
        # Use anonymous object() as default so all arguments handled properly
        sentinel = object()
        getval = self.contents.get(self.key(x), sentinel)
        return getval is not sentinel and getval == x

    def __iter__(self):
        return iter(self.contents.values())  # itervalues or viewvalues on Py2

    def __len__(self):
        return len(self.contents)

    def add(self, x):
        self.contents.setdefault(self.key(x), x)

    def discard(self, x):
        self.contents.pop(self.key(x), None)
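
A usage sketch for the class above, Python 3 only. The class is repeated solely so the snippet runs on its own, and Item is a hypothetical stand-in for an object whose __eq__ and __hash__ you cannot change.

```python
from collections.abc import MutableSet


class keyedset(MutableSet):
    def __init__(self, it=(), key=lambda x: x):
        self.key = key
        self.contents = {}
        for x in it:
            self.add(x)

    def __contains__(self, x):
        sentinel = object()
        getval = self.contents.get(self.key(x), sentinel)
        return getval is not sentinel and getval == x

    def __iter__(self):
        return iter(self.contents.values())

    def __len__(self):
        return len(self.contents)

    def add(self, x):
        self.contents.setdefault(self.key(x), x)

    def discard(self, x):
        self.contents.pop(self.key(x), None)


class Item(object):  # hypothetical; default identity-based equality
    def __init__(self, value):
        self.value = value


a, b, a2 = Item('a'), Item('b'), Item('a')
ks = keyedset([a, b, a2], key=lambda obj: obj.value)

size = len(ks)                                       # a2 shares a's key
first_wins = next(x for x in ks if x.value == 'a') is a
a_in = a in ks
a2_in = a2 in ks    # key matches, but the stored object is a, not a2

ks.discard(a2)      # discard goes by key only, so this removes a
size_after = len(ks)
```

One thing this makes visible: membership checks both the key and equality with the stored object, while discard looks only at the key, so an object that is not "in" the set can still evict the one that shares its key.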

Upvotes: 5
