Reputation: 189
I saw this example at pythontips. I do not understand the second line when defaultdict takes an argument "tree" and return a "tree".
import collections
tree = lambda: collections.defaultdict(tree)
some_dict = tree()
some_dict['color']['favor'] = "yellow"
# Works fine
After I run this code, I checked the type of some_dict
defaultdict(< function < lambda > at 0x7f19ae634048 >,
{'color': defaultdict(
< function < lambda > at 0x7f19ae634048 >, {'favor': 'yellow'})})
Upvotes: 8
Views: 1581
Reputation: 283
This is a pretty clever way to create a recursive defaultdict
. It's a little tricky to understand at first but once you dig into what's happening, it's actually a pretty simple use of recursion.
In this example, we define a recursive lambda function, tree
, that returns a defaultdict
whose constructor is tree
. Let's rewrite this using regular functions for clarity.
from collections import defaultdict
from pprint import pprint
def get_recursive_dict():
return defaultdict(get_recursive_dict)
Note that we're returning defaultdict(get_recursive_dict)
and not defaultdict(get_recursive_dict())
. We want to pass defaultdict
a callable object (i.e. the function get_recursive_dict
). Actually calling get_recursive_dict()
would result in infinite recursion.
If we call get_recursive_dict
, we get an empty defaultdict
whose default value is the function get_recursive_dict
.
recursive_dict = get_recursive_dict()
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004FFC4A8>, {})
Let's see this in action. Create the key 'alice'
and it's corresponding value defaults to an empty defaultdict
whose default value is the function get_recursive_dict
. Notice that this is the same default value as our recursive_dict
!
print(recursive_dict['alice'])
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {'alice': defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})})
So we can create as many nested dictionaries as we want.
recursive_dict['bob']['age'] = 2
recursive_dict['charlie']['food']['dessert'] = 'cake'
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'charlie': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'food': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'dessert': 'cake'})}), 'bob': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'age': 2}), 'alice': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {})})
Once you overwrite the default value with a key, you can no longer create arbitrarily deep nested dictionaries.
recursive_dict['bob']['age']['year'] = 2016
# TypeError: 'int' object does not support item assignment
I hope this clears things up!
Upvotes: 6
Reputation: 5784
It's easier to see if you try this: a = lambda: a
, you'll see that a()
returns a
. So...
>>> a = lambda: a
>>> a()()()()
<function <lambda> at 0x102bffd08>
They're doing this with the defaultdict
too. tree
is a function returning a defaultdict whose default value is yet another defaultdict, and so on.
I wasn't actually aware of this either. I thought tree
would have to be defined first. Maybe it's a special Python rule? (EDIT:) No, I forgot that Python does the name lookup at runtime, and tree
already points to the lambda then. In C++ there's compile-time reference checking, but you can define functions that reference themselves.
It seems like a way to create behavior that some users wouldn't expect. Like say you accidentally redefine tree
later, your defaultdict is broken:
>>> import collections
>>> tree = lambda: collections.defaultdict(tree)
>>> some_dict = tree()
>>> tree = 4
>>> some_dict[4][3] = 2 # TypeError: first argument must be callable or None
Upvotes: 1
Reputation: 164613
Two points to note:
lambda
represents an anonymous function.So here are 2 different ways to define functionally identical objects. They are recursive functions because they reference themselves.
from collections import defaultdict
# anonymous
tree = lambda: defaultdict(tree)
# explicit
def tree(): return defaultdict(tree)
Running the final 2 lines with these different definitions in turn, you see only a subtle difference in the naming of the defaultdict
type:
# anonymous
defaultdict(<function __main__.<lambda>()>,
{'color': defaultdict(<function __main__.<lambda>()>,
{'favor': 'yellow'})})
# explicit
defaultdict(<function __main__.tree()>,
{'color': defaultdict(<function __main__.tree()>,
{'favor': 'yellow'})})
Upvotes: 3