Z-Jiang
Z-Jiang

Reputation: 189

don't understand this lambda expression with defaultdict

I saw this example at pythontips. I do not understand the second line when defaultdict takes an argument "tree" and return a "tree".

import collections
tree = lambda: collections.defaultdict(tree)
some_dict = tree()
some_dict['color']['favor'] = "yellow"
# Works fine

After I run this code, I checked the type of some_dict

defaultdict(< function < lambda > at 0x7f19ae634048 >, 
            {'color': defaultdict(
                  < function < lambda > at 0x7f19ae634048 >, {'favor': 'yellow'})})

Upvotes: 8

Views: 1581

Answers (3)

Daniel Ong
Daniel Ong

Reputation: 283

This is a pretty clever way to create a recursive defaultdict. It's a little tricky to understand at first but once you dig into what's happening, it's actually a pretty simple use of recursion.

In this example, we define a recursive lambda function, tree, that returns a defaultdict whose constructor is tree. Let's rewrite this using regular functions for clarity.

from collections import defaultdict
from pprint import pprint

def get_recursive_dict():
    return defaultdict(get_recursive_dict)

Note that we're returning defaultdict(get_recursive_dict) and not defaultdict(get_recursive_dict()). We want to pass defaultdict a callable object (i.e. the function get_recursive_dict). Actually calling get_recursive_dict() would result in infinite recursion.

If we call get_recursive_dict, we get an empty defaultdict whose default value is the function get_recursive_dict.

recursive_dict = get_recursive_dict()
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004FFC4A8>, {})

Let's see this in action. Create the key 'alice' and it's corresponding value defaults to an empty defaultdict whose default value is the function get_recursive_dict. Notice that this is the same default value as our recursive_dict!

print(recursive_dict['alice'])
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {'alice': defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})})

So we can create as many nested dictionaries as we want.

recursive_dict['bob']['age'] = 2
recursive_dict['charlie']['food']['dessert'] = 'cake'
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'charlie': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'food': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'dessert': 'cake'})}), 'bob': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'age': 2}), 'alice': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {})})

Once you overwrite the default value with a key, you can no longer create arbitrarily deep nested dictionaries.

recursive_dict['bob']['age']['year'] = 2016
# TypeError: 'int' object does not support item assignment

I hope this clears things up!

Upvotes: 6

sudo
sudo

Reputation: 5784

It's easier to see if you try this: a = lambda: a, you'll see that a() returns a. So...

>>> a = lambda: a
>>> a()()()()
<function <lambda> at 0x102bffd08>

They're doing this with the defaultdict too. tree is a function returning a defaultdict whose default value is yet another defaultdict, and so on.

I wasn't actually aware of this either. I thought tree would have to be defined first. Maybe it's a special Python rule? (EDIT:) No, I forgot that Python does the name lookup at runtime, and tree already points to the lambda then. In C++ there's compile-time reference checking, but you can define functions that reference themselves.

It seems like a way to create behavior that some users wouldn't expect. Like say you accidentally redefine tree later, your defaultdict is broken:

>>> import collections
>>> tree = lambda: collections.defaultdict(tree)
>>> some_dict = tree()
>>> tree = 4
>>> some_dict[4][3] = 2 # TypeError: first argument must be callable or None

Upvotes: 1

jpp
jpp

Reputation: 164613

Two points to note:

  1. lambda represents an anonymous function.
  2. Functions are first-class objects in Python. They may be assigned to a variable like any other object.

So here are 2 different ways to define functionally identical objects. They are recursive functions because they reference themselves.

from collections import defaultdict

# anonymous
tree = lambda: defaultdict(tree)

# explicit
def tree(): return defaultdict(tree)

Running the final 2 lines with these different definitions in turn, you see only a subtle difference in the naming of the defaultdict type:

# anonymous
defaultdict(<function __main__.<lambda>()>,
            {'color': defaultdict(<function __main__.<lambda>()>,
                         {'favor': 'yellow'})})

# explicit
defaultdict(<function __main__.tree()>,
            {'color': defaultdict(<function __main__.tree()>,
                         {'favor': 'yellow'})})

Upvotes: 3

Related Questions