Reputation: 107082
Is there a way to have a defaultdict(defaultdict(int)) in order to make the following code work?
for x in stuff:
    d[x.a][x.b] += x.c_int
d needs to be built ad hoc, depending on the x.a and x.b elements.
I could use:
for x in stuff:
    d[x.a,x.b] += x.c_int
but then I wouldn't be able to use:
d.keys()
d[x.a].keys()
Upvotes: 500
Views: 173808
Reputation: 827
import collections
d = collections.defaultdict(collections.Counter)
for x in stuff:
    d[x.a][x.b] += x.c_int
Upvotes: 0
Reputation: 155333
defaultdict(lambda: defaultdict(int)) has a flaw, which is that it isn't pickle-friendly, thanks to the lambda. While you could define the default function globally, e.g.:
def make_defaultdict_int():
    return defaultdict(int)
dd = defaultdict(make_defaultdict_int)
to work around this, that's rather verbose. Luckily, it's pretty easy to make this work in a pickle-friendly way without that:
dd = defaultdict(defaultdict(int).copy)
That makes a template empty defaultdict(int), and passes a bound copy method from it as the factory function. Because defaultdict and int are pickleable, as are all bound methods of pickleable objects, that renders the structure fully pickleable without any custom definitions or additional imports. On some versions of Python, it's more performant than the equivalent lambda (depending on where the recent optimization efforts have been centered), but even when it isn't, the performance is comparable, and it's no more verbose. It's my preferred approach even when pickling isn't a concern, simply because it means I don't need to change approaches if/when pickling becomes important.
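As a rough check of the pickling claim, the sketch below compares the lambda factory with the bound copy factory. It assumes Python 3; the helper can_pickle is a hypothetical name introduced only for this illustration.

```python
import pickle
from collections import defaultdict

def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

lambda_dd = defaultdict(lambda: defaultdict(int))
copy_dd = defaultdict(defaultdict(int).copy)

copy_dd["a"]["b"] += 2

# Round-trip the copy-based structure; the factory travels with it.
restored = pickle.loads(pickle.dumps(copy_dd))

lambda_ok = can_pickle(lambda_dd)  # the lambda cannot be pickled by reference
copy_ok = can_pickle(copy_dd)
```

After the round trip, restored still creates inner defaultdict(int) values for missing keys, because the factory (the bound copy method and its empty template) is pickled along with the data.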
Upvotes: 3
Reputation: 12897
Previous answers have addressed how to make a two-level or n-level defaultdict. In some cases you want an infinite one:
from collections import defaultdict
def ddict():
    return defaultdict(ddict)
Usage:
>>> d = ddict()
>>> d[1]['a'][True] = 0.5
>>> d[1]['b'] = 3
>>> import pprint; pprint.pprint(d)
defaultdict(<function ddict at 0x7fcac68bf048>,
            {1: defaultdict(<function ddict at 0x7fcac68bf048>,
                            {'a': defaultdict(<function ddict at 0x7fcac68bf048>,
                                              {True: 0.5}),
                             'b': 3})})
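One thing to keep in mind with an infinitely nested defaultdict is that merely reading a missing key creates it. The sketch below illustrates this with the same ddict definition; the keys used are arbitrary sample values.

```python
from collections import defaultdict

def ddict():
    return defaultdict(ddict)

d = ddict()
d[1]["a"] = 0.5

# A plain lookup of a missing key inserts a new empty level at each step.
_ = d[2]["missing"]
keys_after_lookup = sorted(d.keys())

# Membership tests and .get() do not call the factory, so nothing is inserted.
has_three = 3 in d
got = d.get(3)
keys_after_checks = sorted(d.keys())
```

If accidental key creation matters, testing with `in` or using .get() before indexing avoids it.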
Upvotes: 52
Reputation: 2307
For reference, it's possible to implement a generic nested defaultdict factory method through:
from collections import defaultdict
from functools import partial
from itertools import repeat
def nested_defaultdict(default_factory, depth=1):
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()
The depth defines the number of nested dictionary levels before the type defined in default_factory is used.
For example:
my_dict = nested_defaultdict(list, 3)
my_dict['a']['b']['c'].append('e')
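To see how depth maps to nesting, here is a small sketch that exercises depths 1, 2, and 3. The function is repeated from this answer so the block runs on its own; the keys and values are arbitrary.

```python
from collections import defaultdict
from functools import partial
from itertools import repeat

def nested_defaultdict(default_factory, depth=1):
    # Same definition as in the answer, repeated so this block is self-contained.
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()

# depth 1: behaves like defaultdict(int)
flat = nested_defaultdict(int)
flat["hits"] += 1

# depth 2: behaves like defaultdict(lambda: defaultdict(int))
two_level = nested_defaultdict(int, 2)
two_level["a"]["b"] += 5

# depth 3: three dictionary levels, then a list
three_level = nested_defaultdict(list, 3)
three_level["a"]["b"]["c"].append("e")
```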
Upvotes: 20
Reputation: 70021
Yes, like this:
defaultdict(lambda: defaultdict(int))
The argument of a defaultdict (in this case lambda: defaultdict(int)) will be called when you try to access a key that doesn't exist. Its return value will be set as the new value of this key, which means in our case the value of d[Key_doesnt_exist] will be defaultdict(int).
If you try to access a key from this last defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it will return 0, which is the return value of the argument of the last defaultdict, i.e. int().
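To make this concrete, here is a minimal sketch applied to the loop from the question. The Item namedtuple and the sample values are hypothetical stand-ins for stuff, which the question does not define.

```python
from collections import defaultdict, namedtuple

# Hypothetical record type standing in for the elements of `stuff`.
Item = namedtuple("Item", ["a", "b", "c_int"])

stuff = [
    Item("x", "p", 1),
    Item("x", "q", 2),
    Item("y", "p", 3),
    Item("x", "p", 4),
]

d = defaultdict(lambda: defaultdict(int))
for x in stuff:
    # A missing d[x.a] is created by the lambda;
    # a missing d[x.a][x.b] starts at int() == 0.
    d[x.a][x.b] += x.c_int

outer_keys = sorted(d.keys())       # first-level keys
inner_keys = sorted(d["x"].keys())  # second-level keys under "x"
total_xp = d["x"]["p"]              # 1 + 4
```

Both d.keys() and d[x.a].keys() work here, which is what the question asked for.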
Upvotes: 883
Reputation: 5441
The parameter to the defaultdict constructor is the function which will be called for building new elements. So let's use a lambda!
>>> from collections import defaultdict
>>> d = defaultdict(lambda : defaultdict(int))
>>> print(d[0])
defaultdict(<class 'int'>, {})
>>> print(d[0]["x"])
0
Since Python 2.7, there's an even better solution using Counter:
>>> from collections import Counter
>>> c = Counter()
>>> c["goodbye"]+=1
>>> c["and thank you"]=42
>>> c["for the fish"]-=5
>>> c
Counter({'and thank you': 42, 'goodbye': 1, 'for the fish': -5})
Some bonus features
>>> c.most_common()[:2]
[('and thank you', 42), ('goodbye', 1)]
For more information see PyMOTW - Collections - Container data types and Python Documentation - collections
Upvotes: 59
Reputation: 45542
Others have correctly answered your question of how to get the following to work:
for x in stuff:
    d[x.a][x.b] += x.c_int
An alternative would be to use tuples for keys:
d = defaultdict(int)
for x in stuff:
    d[x.a,x.b] += x.c_int
    # ^^^^^^^ tuple key
The nice thing about this approach is that it is simple and can be easily expanded. If you need a mapping three levels deep, just use a three-item tuple for the key.
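The question notes that tuple keys lose the per-level key views. One possible way to recover them on demand is sketched below; the records list is hypothetical sample data standing in for the attributes of stuff, and inner_keys is a name chosen for this illustration.

```python
from collections import defaultdict

# Hypothetical (a, b, c_int) triples standing in for the attributes of `stuff`.
records = [("x", "p", 1), ("x", "q", 2), ("y", "p", 3), ("x", "p", 4)]

d = defaultdict(int)
for a, b, c_int in records:
    d[a, b] += c_int

# Rebuild the "second-level keys per first-level key" view from the tuple keys.
inner_keys = defaultdict(set)
for a, b in d:
    inner_keys[a].add(b)

first_level = sorted(inner_keys)   # plays the role of d.keys()
x_keys = sorted(inner_keys["x"])   # plays the role of d["x"].keys()
```

This costs an extra pass over the keys, so it suits cases where the grouped view is needed only occasionally.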
Upvotes: 8
Reputation: 123622
I find it slightly more elegant to use partial:
import functools
from collections import defaultdict
dd_int = functools.partial(defaultdict, int)
defaultdict(dd_int)
Of course, this is the same as a lambda.
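For creating default values the two behave alike, but one practical difference is that a partial wrapping importable callables can be pickled, while a lambda cannot. A minimal sketch, assuming Python 3 and arbitrary sample keys:

```python
import functools
import pickle
from collections import defaultdict

dd_int = functools.partial(defaultdict, int)
d = defaultdict(dd_int)
d["a"]["b"] += 1

# partial objects wrapping importable callables can be pickled,
# so the whole nested structure round-trips.
restored = pickle.loads(pickle.dumps(d))
```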
Upvotes: 37