Jonathan Livni

Reputation: 107082

defaultdict of defaultdict?

Is there a way to have a defaultdict(defaultdict(int)) in order to make the following code work?

for x in stuff:
    d[x.a][x.b] += x.c_int

d needs to be built ad-hoc, depending on x.a and x.b elements.

I could use:

for x in stuff:
    d[x.a,x.b] += x.c_int

but then I wouldn't be able to use:

d.keys()
d[x.a].keys()

Upvotes: 500

Views: 173808

Answers (8)

Q. Qiao

Reputation: 827

import collections

d = collections.defaultdict(collections.Counter)

for x in stuff:
    d[x.a][x.b] += x.c_int

Upvotes: 0

ShadowRanger

Reputation: 155333

defaultdict(lambda: defaultdict(int)) has a flaw, which is that it isn't pickle friendly, thanks to the lambda. While you could define the default function globally, e.g.:

from collections import defaultdict

def make_defaultdict_int():
    return defaultdict(int)

dd = defaultdict(make_defaultdict_int)

to work around this, that's rather verbose. Luckily, it's pretty easy to make this work in a pickle-friendly way without that:

dd = defaultdict(defaultdict(int).copy)

That creates an empty template defaultdict(int) and passes its bound copy method as the factory function. Because defaultdict and int are pickleable, as are all bound methods of pickleable objects, the whole structure is pickleable without any custom definitions or additional imports. On some versions of Python it's faster than the equivalent lambda (depending on where recent optimization efforts have been centered), and even when it isn't, the performance is comparable. It's no more verbose either, so it's my preferred approach even when pickling isn't a concern, simply because I don't need to change approaches if pickling becomes important later.
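To make the pickling difference concrete, here is a minimal sketch comparing the lambda form with the bound-copy form. The variable names and sample keys are made up for illustration, and the exact exception raised for the lambda can differ between Python versions, so the sketch catches both likely ones.

```python
import pickle
from collections import defaultdict

# Lambda-based factory: works at runtime but cannot be pickled.
dd_lambda = defaultdict(lambda: defaultdict(int))
dd_lambda["a"]["b"] += 1

try:
    pickle.dumps(dd_lambda)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

# Bound-copy factory: each missing key gets a fresh copy of the empty template.
dd_copy = defaultdict(defaultdict(int).copy)
dd_copy["a"]["b"] += 1

# Round trip through pickle; the factory should still work afterwards.
restored = pickle.loads(pickle.dumps(dd_copy))
restored["x"]["y"] += 5
```

After the round trip, restored should still create inner defaultdict(int) values on demand, because the factory is pickled along with the data.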

Upvotes: 3

Clément

Reputation: 12897

Previous answers have addressed how to make a two-level or n-level defaultdict. In some cases you want an infinitely nested one:

from collections import defaultdict

def ddict():
    return defaultdict(ddict)

Usage:

>>> d = ddict()
>>> d[1]['a'][True] = 0.5
>>> d[1]['b'] = 3
>>> import pprint; pprint.pprint(d)
defaultdict(<function ddict at 0x7fcac68bf048>,
            {1: defaultdict(<function ddict at 0x7fcac68bf048>,
                            {'a': defaultdict(<function ddict at 0x7fcac68bf048>,
                                              {True: 0.5}),
                             'b': 3})})

Upvotes: 52

Campi

Reputation: 2307

For reference, it's possible to implement a generic nested defaultdict factory function like this:

from collections import defaultdict
from functools import partial
from itertools import repeat


def nested_defaultdict(default_factory, depth=1):
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()

The depth defines the number of nested dictionary levels before the type given in default_factory is used. For example:

my_dict = nested_defaultdict(list, 3)
my_dict['a']['b']['c'].append('e')
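As a rough illustration of how depth maps onto the question's case, a depth of 2 with int should behave like defaultdict(lambda: defaultdict(int)). The function is repeated so the snippet runs on its own; the sample keys are invented.

```python
from collections import defaultdict
from functools import partial
from itertools import repeat


def nested_defaultdict(default_factory, depth=1):
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()


# depth=1: an ordinary defaultdict(list)
flat = nested_defaultdict(list)
flat["a"].append(1)

# depth=2: two dictionary levels, then int as the leaf default
counts = nested_defaultdict(int, 2)
counts["row"]["col"] += 3
```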

Upvotes: 20

mouad

Reputation: 70021

Yes, like this:

defaultdict(lambda: defaultdict(int))

The argument of a defaultdict (in this case lambda: defaultdict(int)) is called when you try to access a key that doesn't exist. Its return value is stored as the new value for that key, which means that in our case the value of d[Key_doesnt_exist] will be defaultdict(int).

If you then access a missing key on this inner defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it returns 0, which is the return value of the inner defaultdict's argument, i.e. int().
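A small sketch of this behaviour, applied to the loop from the question. The Record type and its sample values are hypothetical stand-ins for the asker's stuff.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the objects in the question's `stuff`.
Record = namedtuple("Record", ["a", "b", "c_int"])
stuff = [
    Record("x", "p", 1),
    Record("x", "q", 2),
    Record("y", "p", 3),
    Record("x", "p", 4),
]

d = defaultdict(lambda: defaultdict(int))
for x in stuff:
    # d[x.a] creates an inner defaultdict(int) on first access;
    # d[x.a][x.b] then starts from int() == 0.
    d[x.a][x.b] += x.c_int

# Both levels of keys are available, as the question requires.
outer_keys = sorted(d.keys())
inner_keys = sorted(d["x"].keys())
```

Keep in mind that merely reading a missing key, such as d["z"], also inserts it into the dictionary.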

Upvotes: 883

yanjost

Reputation: 5441

The parameter to the defaultdict constructor is the function that will be called to build new elements. So let's use a lambda!

>>> from collections import defaultdict
>>> d = defaultdict(lambda : defaultdict(int))
>>> print(d[0])
defaultdict(<class 'int'>, {})
>>> print(d[0]["x"])
0

Since Python 2.7, there's an even better solution using Counter:

>>> from collections import Counter
>>> c = Counter()
>>> c["goodbye"]+=1
>>> c["and thank you"]=42
>>> c["for the fish"]-=5
>>> c
Counter({'and thank you': 42, 'goodbye': 1, 'for the fish': -5})

Some bonus features

>>> c.most_common()[:2]
[('and thank you', 42), ('goodbye', 1)]

For more information see PyMOTW - Collections - Container data types and Python Documentation - collections

Upvotes: 59

Steven Rumbalski

Reputation: 45542

Others have correctly answered your question of how to get the following to work:

for x in stuff:
    d[x.a][x.b] += x.c_int

An alternative would be to use tuples for keys:

from collections import defaultdict

d = defaultdict(int)
for x in stuff:
    d[x.a,x.b] += x.c_int
    # ^^^^^^^ tuple key

The nice thing about this approach is that it is simple and easy to extend. If you need a mapping three levels deep, just use a three-item tuple for the key.
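A sketch of the three-level case, with made-up rows. Since the question notes that tuple keys lose d[x.a].keys(), the last line shows one way to recover the second-level keys for a given first-level key by scanning all keys; that part is an illustration, not something from the answer itself.

```python
from collections import defaultdict

d = defaultdict(int)

# Hypothetical (a, b, c, amount) rows.
rows = [("x", "p", 1, 10), ("x", "p", 2, 5), ("x", "q", 1, 7), ("x", "p", 1, 1)]
for a, b, c, amount in rows:
    d[a, b, c] += amount  # three-item tuple key

# Second-level keys under first-level key "x", found by scanning all keys.
second_level = sorted({b for (a, b, c) in d if a == "x"})
```

The scan is linear in the number of keys, which is the price paid for the flat structure.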

Upvotes: 8

Katriel

Reputation: 123622

I find it slightly more elegant to use partial:

import functools
from collections import defaultdict

dd_int = functools.partial(defaultdict, int)
defaultdict(dd_int)

Of course, this is the same as a lambda.
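A quick sketch of that equivalence, with invented keys: both factories should produce the same nested structure when used the same way.

```python
import functools
from collections import defaultdict

dd_int = functools.partial(defaultdict, int)

via_partial = defaultdict(dd_int)
via_lambda = defaultdict(lambda: defaultdict(int))

# Apply the same update through both; the inner dicts are created on demand.
for d in (via_partial, via_lambda):
    d["a"]["b"] += 2
```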

Upvotes: 37
