jlconlin
jlconlin

Reputation: 15064

How to implement an ordered, default dict?

I would like to combine OrderedDict() and defaultdict() from collections in one object, which shall be an ordered, default dict.
Is this possible?

Upvotes: 227

Views: 82705

Answers (11)

user3064538
user3064538

Reputation:

defaultdict is ordered by insertion order on Python 3.7+ (and CPython 3.6+).

Upvotes: 1

Nikolay Prokopyev
Nikolay Prokopyev

Reputation: 1312

I created slightly fixed and more simplified version of the accepted answer, actual for python 3.7.

from collections import OrderedDict
from copy import copy, deepcopy
import pickle
from typing import Any, Callable


class DefaultOrderedDict(OrderedDict):
    def __init__(
            self,
            default_factory: Callable[[], Any],
            *args,
            **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        return type(self), (self.default_factory, ), None, None, iter(self.items())

    def copy(self):
        return self.__copy__()

    def __copy__(self):
        return type(self)(self.default_factory, self)

    def __deepcopy__(self, memo):
        return type(self)(self.default_factory, deepcopy(tuple(self.items()), memo))

    def __repr__(self):
        return f'{self.__class__.__name__}({self.default_factory}, {OrderedDict(self).__repr__()})'

And, that may be even more important, provided some tests.

a = DefaultOrderedDict(list)

# testing default
assert a['key'] == []
a['key'].append(1)
assert a['key'] == [1, ]

# testing repr
assert repr(a) == "DefaultOrderedDict(<class 'list'>, OrderedDict([('key', [1])]))"

# testing copy
b = a.copy()
assert b['key'] is a['key']
c = copy(a)
assert c['key'] is a['key']
d = deepcopy(a)
assert d['key'] is not a['key']
assert d['key'] == a['key']

# testing pickle
saved = pickle.dumps(a)
restored = pickle.loads(saved)
assert restored is not a
assert restored == a

# testing order
a['second_key'] = [2, ]
a['key'] = [3, ]
assert list(a.items()) == [('key', [3, ]), ('second_key', [2, ])]

Upvotes: 0

Samarth
Samarth

Reputation: 139

Inspired by other answers on this thread, you can use something like,

from collections import OrderedDict

class OrderedDefaultDict(OrderedDict):
    def __missing__(self, key):
        value = OrderedDefaultDict()
        self[key] = value
        return value

I would like to know if there're any downsides of initializing another object of the same class in the missing method.

Upvotes: -2

snebel29
snebel29

Reputation: 199

Another simple approach would be to use dictionary get method

>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> d['key'] = d.get('key', 0) + 1
>>> d['key'] = d.get('key', 0) + 1
>>> d
OrderedDict([('key', 2)])
>>> 

Upvotes: 13

F Pereira
F Pereira

Reputation: 1367

A simple and elegant solution building on @NickBread. Has a slightly different API to set the factory, but good defaults are always nice to have.

class OrderedDefaultDict(OrderedDict):
    factory = list

    def __missing__(self, key):
        self[key] = value = self.factory()
        return value

Upvotes: 8

Taylor D. Edmiston
Taylor D. Edmiston

Reputation: 13036

Here's another solution to think about if your use case is simple like mine and you don't necessarily want to add the complexity of a DefaultOrderedDict class implementation to your code.

from collections import OrderedDict

keys = ['a', 'b', 'c']
items = [(key, None) for key in keys]
od = OrderedDict(items)

(None is my desired default value.)

Note that this solution won't work if one of your requirements is to dynamically insert new keys with the default value. A tradeoff of simplicity.

Update 3/13/17 - I learned of a convenience function for this use case. Same as above but you can omit the line items = ... and just:

od = OrderedDict.fromkeys(keys)

Output:

OrderedDict([('a', None), ('b', None), ('c', None)])

And if your keys are single characters, you can just pass one string:

OrderedDict.fromkeys('abc')

This has the same output as the two examples above.

You can also pass a default value as the second arg to OrderedDict.fromkeys(...).

Upvotes: 30

Ortal Turgeman
Ortal Turgeman

Reputation: 141

i tested the default dict and discovered it's also sorted! maybe it was just a coincidence but anyway you can use the sorted function:

sorted(s.items())

i think it's simpler

Upvotes: -4

Artyer
Artyer

Reputation: 40891

If you want a simple solution that doesn't require a class, you can just use OrderedDict.setdefault(key, default=None) or OrderedDict.get(key, default=None). If you only get / set from a few places, say in a loop, you can easily just setdefault.

totals = collections.OrderedDict()

for i, x in some_generator():
    totals[i] = totals.get(i, 0) + x

It is even easier for lists with setdefault:

agglomerate = collections.OrderedDict()

for i, x in some_generator():
    agglomerate.setdefault(i, []).append(x)

But if you use it more than a few times, it is probably better to set up a class, like in the other answers.

Upvotes: 44

Neck Beard
Neck Beard

Reputation: 433

A simpler version of @zeekay 's answer is:

from collections import OrderedDict

class OrderedDefaultListDict(OrderedDict): #name according to default
    def __missing__(self, key):
        self[key] = value = [] #change to whatever default you want
        return value

Upvotes: 8

avyfain
avyfain

Reputation: 854

Here is another possibility, inspired by Raymond Hettinger's super() Considered Super, tested on Python 2.7.X and 3.4.X:

from collections import OrderedDict, defaultdict

class OrderedDefaultDict(OrderedDict, defaultdict):
    def __init__(self, default_factory=None, *args, **kwargs):
        #in python3 you can omit the args to super
        super(OrderedDefaultDict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory

If you check out the class's MRO (aka, help(OrderedDefaultDict)), you'll see this:

class OrderedDefaultDict(collections.OrderedDict, collections.defaultdict)
 |  Method resolution order:
 |      OrderedDefaultDict
 |      collections.OrderedDict
 |      collections.defaultdict
 |      __builtin__.dict
 |      __builtin__.object

meaning that when an instance of OrderedDefaultDict is initialized, it defers to the OrderedDict's init, but this one in turn will call the defaultdict's methods before calling __builtin__.dict, which is precisely what we want.

Upvotes: 50

Zach Kelling
Zach Kelling

Reputation: 53839

The following (using a modified version of this recipe) works for me:

from collections import OrderedDict, Callable

class DefaultOrderedDict(OrderedDict):
    # Source: http://stackoverflow.com/a/6190500/562769
    def __init__(self, default_factory=None, *a, **kw):
        if (default_factory is not None and
           not isinstance(default_factory, Callable)):
            raise TypeError('first argument must be callable')
        OrderedDict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return OrderedDict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        if self.default_factory is None:
            args = tuple()
        else:
            args = self.default_factory,
        return type(self), args, None, None, self.items()

    def copy(self):
        return self.__copy__()

    def __copy__(self):
        return type(self)(self.default_factory, self)

    def __deepcopy__(self, memo):
        import copy
        return type(self)(self.default_factory,
                          copy.deepcopy(self.items()))

    def __repr__(self):
        return 'OrderedDefaultDict(%s, %s)' % (self.default_factory,
                                               OrderedDict.__repr__(self))

Upvotes: 103

Related Questions