Reputation: 3245
First of all, this post does NOT answer my question or give me any guidance toward answering it.
My question is about the mechanism by which functions resolve non-local variables.
Code
# code block 1
def func():
    vals = [0, 0, 0]
    other_vals = [7, 8, 9]
    other = 12
    def func1():
        vals[1] += 1
        print(vals)
    def func2():
        vals[2] += 2
        print(vals)
    return (func1, func2)
f1, f2 = func()
Try to run f1 and f2:
>>> f1()
[0, 1, 0]
>>> f2()
[0, 1, 2]
This shows that the object previously referred to by vals is shared by f1 and f2, and is not garbage collected after func finishes executing.
Will the objects referred to by other_vals and other be garbage collected? I think so. But how does Python decide not to garbage collect vals?
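As a quick check (a sketch of my own, assuming CPython's reference-counting behavior), an object with a __del__ method reports when it is actually destroyed:
# sketch: observing collection in CPython
class Tracked:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name, "collected")

def func():
    vals = Tracked("vals")              # referenced by the inner function
    other_vals = Tracked("other_vals")  # not referenced by the inner function
    def func1():
        print("inner sees", vals.name)
    return func1

f1 = func()   # prints "other_vals collected" as soon as func returns
f1()          # prints "inner sees vals"
del f1        # prints "vals collected" once the closure itself is gone
The other_vals object goes away as soon as func returns, but vals survives until f1 itself is dropped.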
Assumption 1
The Python interpreter resolves the variable names inside func1 and func2 to figure out which objects they reference, and increases the reference count of [0, 0, 0] by 1, preventing it from being garbage collected after the call to func.
But if I do
# code block 2
def outerfunc():
    def innerfunc():
        print(non_existent_variable)
f = outerfunc()
No error is reported. Furthermore,
# code block 3
def my_func():
    print(yet_to_define)
yet_to_define = "hello"
my_func()
works.
Assumption 2
Variable names are resolved dynamically at run time. This makes the observations in code blocks 2 and 3 easy to explain, but then how did the interpreter know it needs to increase the reference count of [0, 0, 0] in code block 1?
Which assumption is correct?
Upvotes: 2
Views: 294
Reputation: 55469
Your first example creates a closure; also see Why aren't python nested functions called closures?, Can you explain closures (as they relate to Python)?, and What exactly is contained within a obj.__closure__?
The closure mechanism ensures that the interpreter stores a reference to vals in the returned function objects func1 and func2. Your Assumption 1 is correct: that reference prevents vals from being garbage collected when func returns.
In your second example, the interpreter cannot see a reference to non_existent_variable in the enclosing scope(s), but that's OK because your Assumption 2 is also correct: you're free to use names that haven't yet been bound to objects at function definition time, so long as the name is in scope when you actually call the function.
The answer to "how did the interpreter know it needs to increase the reference count of [0, 0, 0] in code block 1?" is that building the closure is an explicit step the interpreter performs when it executes a function definition, i.e., when it creates a function object from the function definition in your script.
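You can see the result of that step by inspecting the function objects from your code block 1 (a quick sketch using the documented __code__ and __closure__ attributes):
f1, f2 = func()
print(f1.__code__.co_freevars)                 # ('vals',): free variables identified at compile time
print(f1.__closure__[0].cell_contents)         # [0, 0, 0]: the captured list
print(f1.__closure__[0] is f2.__closure__[0])  # True: f1 and f2 share the same cell
That shared cell is what keeps [0, 0, 0] alive after func returns.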
Every Python function object (both normal def-style functions and lambdas) has an attribute to store this closure information, with a minor difference between Python 2 and Python 3. See the links at the start of this answer for details, but I will mention here that Python 3 provides the nonlocal keyword, which works a bit like the global keyword: nonlocal allows you to make assignments to closed-over simple variables; J.F. Sebastian's answer has a simple example illustrating the use of nonlocal.
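For instance, a minimal sketch of nonlocal in action (my own illustration, not the one from J.F. Sebastian's answer):
def make_counter():
    count = 0
    def increment():
        nonlocal count      # rebind the enclosing count instead of creating a new local
        count += 1
        return count
    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
Without the nonlocal declaration, count += 1 would be treated as an assignment to a new local variable and raise UnboundLocalError.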
Note that with nested functions the inner function definitions are processed each time you call the outer function, which allows you to do things like:
def func(vals):
    def func1():
        vals[1] += 1
        print(vals)
    def func2():
        vals[2] += 2
        print(vals)
    return func1, func2

f1, f2 = func([0, 0, 0])
f1()
f2()

f1, f2 = func([10, 20, 30])
f1()
f2()
output
[0, 1, 0]
[0, 1, 2]
[10, 21, 30]
[10, 21, 32]
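You can confirm that each call builds its own closure cells (again via the __closure__ attribute):
f1a, f2a = func([0, 0, 0])
f1b, f2b = func([0, 0, 0])
print(f1a.__closure__[0] is f2a.__closure__[0])  # True: one call, one shared cell
print(f1a.__closure__[0] is f1b.__closure__[0])  # False: separate calls, separate cells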
Upvotes: 3