ostrokach

Reputation: 19932

Write a Class which returns different values when called as list(c) and dict(c)

I am trying to implement a custom class that returns different values when passed to list(c) and dict(c). However, my impression is that both list(c) and dict(c) use c.__iter__() under the hood. If that is the case, how can I get different behaviour from list(c) and dict(c)? I know it is possible because Python dictionaries and pandas DataFrames behave differently under the two calls.

For example:

class Foo:
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        for key, value in zip(self._keys, self._data):
            yield key, value

Calling dict(f), I get what I want:

>>> f = Foo()
>>> dict(f)
{'a': 10, 'b': 20, 'd': 40, 'e': 50}

However, I can't get list(f) to return a list of keys (or values); instead I get both:

>>> f = Foo()
>>> list(f)
[('a', 10), ('b', 20), ('d', 30), ('d', 40), ('e', 50)]

The equivalent code for a dictionary is much cleaner:

>>> f = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}
>>> dict(f)
{'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}
>>> list(f)
['a', 'b', 'c', 'd', 'e']

Upvotes: 4

Views: 233

Answers (2)

For list(f) to return the keys, __iter__ must yield only the keys.

The Python documentation says the following of the dict constructor:

If a positional argument is given and it is a mapping object, a dictionary is created with the same key-value pairs as the mapping object.

Now, the question is what counts as enough of a "mapping" for the dict constructor. DataFrame doesn't inherit from any mapping class, nor is it registered against an abstract base class. It turns out we only need to support the keys method and indexing: if the object passed to the dict constructor has a method called keys, it is called to provide an iterable of the keys [CPython source]. For each key, the value is then fetched by indexing.

I.e. the dict constructor does the logical equivalent of the following:

if hasattr(source, 'keys'):
    for k in source.keys():
        self[k] = source[k]
else:
    self.update(iter(source))

Using this we get

class Foo:
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        return iter(self._keys)

    def __getitem__(self, key):
        idx = self._keys.index(key)
        return self._data[idx]

    def keys(self):
        return self._keys

Testing:

>>> f = Foo()
>>> list(f)
['a', 'b', 'd', 'd', 'e']

>>> dict(f)
{'a': 10, 'b': 20, 'd': 30, 'e': 50}

(As you can see from the code above, there is no need to actually inherit from anything)
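
To make the dispatch described above concrete, here is a minimal sketch written as a standalone function. The names build_dict and KeysOnly are hypothetical and only for illustration; this mirrors the logical equivalent given earlier and is not the actual CPython implementation.

```python
import collections.abc


def build_dict(source):
    """Hypothetical helper mimicking how dict() picks a strategy."""
    result = {}
    if hasattr(source, 'keys'):
        # Mapping-like: iterate the keys and index into the source.
        for k in source.keys():
            result[k] = source[k]
    else:
        # Otherwise treat the source as an iterable of (key, value) pairs.
        for k, v in source:
            result[k] = v
    return result


class KeysOnly:
    """Hypothetical class: no base class, just keys() and __getitem__."""

    def keys(self):
        return ['x', 'y']

    def __getitem__(self, key):
        return {'x': 1, 'y': 2}[key]


obj = KeysOnly()
pairs = [('x', 1), ('y', 2)]

print(build_dict(obj))    # {'x': 1, 'y': 2}
print(build_dict(pairs))  # {'x': 1, 'y': 2}
print(dict(obj) == build_dict(obj))              # True
print(isinstance(obj, collections.abc.Mapping))  # False
```

The last line shows that the object is accepted by dict() even though it is not registered as a Mapping.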

However, it is not guaranteed that all mapping constructors behave the same way; some might call items instead. The most compatible approach is therefore to implement all of the methods required by collections.abc.Mapping and inherit from it. That is, it would be enough to do

import collections.abc

class Foo(collections.abc.Mapping):
    ...
    def __getitem__(self, key):
        idx = self._keys.index(key)
        return self._data[idx]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)
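
As a rough illustration of what inheriting from collections.abc.Mapping buys you, the sketch below defines only the three abstract methods and then uses the mixin methods (items, values, get, in) that the base class derives from them. The class name PairStore and its constructor arguments are hypothetical. Translating ValueError into KeyError is my own addition: the mixins catch KeyError, so a bare list.index failure would otherwise leak out of get and in.

```python
import collections.abc


class PairStore(collections.abc.Mapping):
    """Hypothetical read-only mapping over two parallel lists."""

    def __init__(self, keys, values):
        self._keys = list(keys)
        self._data = list(values)

    def __getitem__(self, key):
        try:
            return self._data[self._keys.index(key)]
        except ValueError:
            # Mapping's mixins (get, __contains__) expect KeyError.
            raise KeyError(key) from None

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


store = PairStore(['a', 'b', 'c'], [10, 20, 30])

print(list(store))           # ['a', 'b', 'c']
print(dict(store))           # {'a': 10, 'b': 20, 'c': 30}
print(list(store.items()))   # [('a', 10), ('b', 20), ('c', 30)]
print(list(store.values()))  # [10, 20, 30]
print('a' in store)          # True
print(store.get('z', 0))     # 0
```

A constructor that calls items() rather than keys() would therefore work with this class without any extra code.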

Upvotes: 6

ostrokach

Reputation: 19932

@mgilson's comment is correct: this can be accomplished by inheriting from the collections.abc.Mapping class:

import collections.abc

class Foo(collections.abc.Mapping):
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        for key in self._keys:
            yield key

    def __getitem__(self, key):
        return self._data[self._keys.index(key)]

    def __len__(self):
        return len(self._keys)

>>> f = Foo()
>>> list(f)
['a', 'b', 'd', 'd', 'e']

>>> dict(f)
{'a': 10, 'b': 20, 'd': 30, 'e': 50}

Upvotes: 2
