user541686
user541686

Reputation: 210525

Keyed Collection in Python?

Is there any equivalent to KeyedCollection in Python, i.e. a set where the elements have (or dynamically generate) their own keys?

i.e. the goal here is to avoid storing the key in two places, and therefore dictionaries are less than ideal (hence the question).

Upvotes: 8

Views: 1071

Answers (8)

Fabien Letort
Fabien Letort

Reputation: 31

To go a little more in detail that the already correct answer from @Gabi Purcaru's answer, here a class that do the same as gabi one's but that also check for correct given type on key and value (as the TKey and TValue of the .net KeyedCollection).

class KeyedCollection(MutableMapping):
    """
    Provides the abstract base class for a collection (:class:`MutableMappinp`) whose keys are embedded in the values.
    """
    __metaclass__ = abc.ABCMeta
    _dict = None  # type: dict

    def __init__(self, seq={}):
        self._dict = dict(seq)

    @abc.abstractmethod
    def __is_type_key_correct__(self, key):
        """
        Returns: The type of keys in the collection
        """
        pass

    @abc.abstractmethod
    def __is_type_value_correct__(self, value):
        """
        Returns: The type of values in the collection
        """
        pass

    @abc.abstractmethod
    def get_key_for_item(self, value):
        """
        When implemented in a derivated class, extracts the key from the specified element.
        Args:
            value: the element from which to extract the key (of type specified by :meth:`type_value`)

        Returns: The key of specified element (of type specified by :meth:`type_key`)
        """
        pass

    def __assert_type_key(self, key, arg_name='key'):
        if not self.__is_type_key_correct__(key) :
            raise ValueError("{} type is not correct".format(arg_name))

    def __assert_type_value(self, value, arg_name='value'):
        if not self.__is_type_value_correct__(value) :
            raise ValueError("{} type is not correct".format(arg_name))

    def add(self, value):
        """
        Adds an object to the KeyedCollection.
        Args:
            value: The object to be added to the KeyedCollection (of type specified by :meth:`type_value`).
        """
        key = self.get_key_for_item(value)
        self._dict[key] = value

    # Implements abstract method __setitem__ from MutableMapping parent class
    def __setitem__(self, key, value):
        self.__assert_type_key(key)
        self.__assert_type_value(value)
        if value.get_key() != key:
            raise ValueError("provided key does not correspond to the given KeyedObject value")
        self._dict[key] = value

    # Implements abstract method __delitem__ from MutableMapping parent class
    def __delitem__(self, key):
        self.__assert_type_key(key)
        self._dict.pop(key)

    # Implements abstract method __getitem__ from MutableMapping parent class (Mapping base class)
    def __getitem__(self, key):
        self.__assert_type_key(key)
        return self._dict[key]

    # Implements abstract method __len__ from MutableMapping parent class (Sized mixin on Mapping base class)
    def __len__(self):
        return len(self._dict)

    # Implements abstract method __iter__ from MutableMapping parent class (Iterable mixin on Mapping base class)
    def __iter__(self):
        return iter(self._dict)
        pass

    # Implements abstract method __contains__ from MutableMapping parent class (Container mixin on Mapping base class)
    def __contains__(self, x):
        self.__assert_type_key(x, 'x')
        return x in self._dict

Upvotes: 0

Zzz
Zzz

Reputation: 1

How about a set()? The elements can have their own k

Upvotes: 0

kindall
kindall

Reputation: 184211

Given your constraints, everyone trying to implement what you're looking for using a dict is barking up the wrong tree. Instead, you should write a list subclass that overrides __getitem__ to provide the behavior you want. I've written it so it tries to get the desired item by index first, then falls back to searching for the item by the key attribute of the contained objects. (This could be a property if the object needs to determine this dynamically.)

There's no way to avoid a linear search if you don't want to duplicate something somewhere; I am sure the C# implementation does exactly the same thing if you don't allow it to use a dictionary to store the keys.

class KeyedCollection(list):
     def __getitem__(self, key):
         if isinstance(key, int) or isinstance(key, slice):
             return list.__getitem__(key)
         for item in self:
             if getattr(item, "key", 0) == key:
                 return item
         raise KeyError('item with key `%s` not found' % key)

You would probably also want to override __contains__ in a similar manner so you could say if "key" in kc.... If you want to make it even more like a dict, you could also implement keys() and so on. They will be equally inefficient, but you will have an API like a dict, that also works like a list.

Upvotes: 2

Ned Batchelder
Ned Batchelder

Reputation: 375624

@Mehrdad said:

Because semantically, it doesn't make as much sense. When an object knows its key, it doesn't make sense to put it in a dictionary -- it's not a key-value pair. It's more of a semantic issue than anything else.

With this constraint, there is nothing in Python that does what you want. I suggest you use a dict and not worry about this level of detail on the semantics. @Gabi Purcaru's answer shows how you can create an object with the interface you want. Why get bothered about how it's working internally?

It could be that C#'s KeyedCollection is doing the same thing under the covers: asking the object for its key and then storing the key for fast access. In fact, from the docs:

By default, the KeyedCollection(Of TKey, TItem) includes a lookup dictionary that you can obtain with the Dictionary property. When an item is added to the KeyedCollection(Of TKey, TItem), the item's key is extracted once and saved in the lookup dictionary for faster searches. This behavior is overridden by specifying a dictionary creation threshold when you create the KeyedCollection(Of TKey, TItem). The lookup dictionary is created the first time the number of elements exceeds that threshold. If you specify –1 as the threshold, the lookup dictionary is never created.

Upvotes: 1

sampwing
sampwing

Reputation: 1268

I'm not sure if this is what you meant, but this dictionary will create it's own keys as you add to it...

class KeyedCollection(dict):
    def __init__(self):
        self.current_key = 0
    def add(self, item):
        self[self.current_key] = item

abc = KeyedCollection()
abc.add('bob')
abc.add('jane')
>>> abc
{0: 'bob', 1: 'jane'}

Upvotes: 0

steveha
steveha

Reputation: 76715

Why not simply use a dict? If the key already exists, a reference to the key will be used in the dict; it won't be senselessly duplicated.

class MyExample(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value

m = MyExample("foo", "bar")
d = {}

d[m.key] = m

first_key = d.keys()[0]
first_key is m.key  # returns True

If the key doesn't already exist, a copy of it will be saved, but I don't see that as a problem.

def lame_hash(s):
    h = 0
    for ch in s:
        h ^= ord(ch)
    return h

d = {}
d[lame_hash(m.key)] = m
print d  # key value is 102 which is stored in the dict

lame_hash(m.key) in d  # returns True

Upvotes: 0

Gabi Purcaru
Gabi Purcaru

Reputation: 31524

You can simulate that very easily:

class KeyedObject(object):
    def get_key(self):
        raise NotImplementedError("You must subclass this before you can use it.")

class KeyedDict(dict):
    def append(self, obj):
        self[obj.get_key()] = obj

Now you can use a KeyedDict instead of dict with subclasses of KeyedObject (where get_key return a valid key based on some object property).

Upvotes: 4

Paul
Paul

Reputation: 21975

I'm not much of a C#'er, but I think dictionaries is what you need.

http://docs.python.org/tutorial/datastructures.html#dictionaries

http://docs.python.org/tutorial/datastructures.html

Or maybe lists:

http://docs.python.org/library/functions.html#list

Upvotes: 1

Related Questions