Reputation: 49
Here's a problem I got in an interview (Python 3.7):
def add(x, y):
    return x + y

g = (x for x in range(4))
for n in [1, 10]:
    g = (add(n, i) for i in g)
list(g)
What does list(g) return? The answer is
[20, 21, 22, 23]
From the output, I guess the add function was applied twice, and both times n was 10? Can someone explain what happens step by step? I am so confused. Many thanks.
Upvotes: 4
Views: 109
Reputation: 3503
Because g is a generator object. Unlike a list comprehension, which is evaluated immediately, it is just a generator instance waiting to be iterated.
>>> from inspect import getgeneratorstate
>>> g = (x for x in range(4))
>>> getgeneratorstate(g)
'GEN_CREATED'
>>> next(g)
0
>>> getgeneratorstate(g)
'GEN_SUSPENDED'
>>> list(g)
[1, 2, 3]
>>> getgeneratorstate(g)
'GEN_CLOSED'
However, the reference to the first generator (x for x in range(4)) that is stored inside the new generator object does not change, because g is just a name that refers to an object in memory.
Name is merely a post-it on a box. - Fluent Python.
So when we pass g, only a reference to the object it names (in CPython, effectively its memory address) is passed, not the name g itself. Therefore, in the following case:
>>> g = (x for x in range(4))
>>> g
<generator object <genexpr> at 0x036babbc>
>>> g = (add(n, i) for i in g)
the g inside the generator expression (add(n, i) for i in g) merely hands the object at address 0x036babbc to the expression, and the generator instance created from that expression keeps a reference to that object, so rebinding g afterwards does not affect generator instances that were already created.
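A minimal sketch of that point, using made-up names (source, wrapped) and a simple doubling expression rather than anything from the question: the iterable is picked up when the generator expression is created, so rebinding g later changes nothing.

```python
# Hypothetical names, for illustration only.
source = (x for x in range(4))
g = source

# The iterable after "in" is evaluated right here, while g still
# refers to the first generator.
wrapped = (i * 2 for i in g)

# Rebinding the name g does not touch the object that wrapped already holds.
g = None

result = list(wrapped)
print(result)  # [0, 2, 4, 6]
```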
So in sequence:
>>> g = (x for x in range(4))
>>> g
<generator object <genexpr> at 0x0452c22c> # 1
>>> g = (add(10, i) for i in g)
>>> g
<generator object <genexpr> at 0x044ee178> # 2
>>> g.gi_frame.f_locals['.0']
<generator object <genexpr> at 0x0452c22c> # 1 stored
>>> g = (add(10, i) for i in g)
>>> g
<generator object <genexpr> at 0x03bd88c8> # 3
>>> g.gi_frame.f_locals['.0']
<generator object <genexpr> at 0x044ee178> # 2 stored
As you can see, each generator expression remembers the generator instance it was given when it was created, so the generators keep getting nested.
Upvotes: 1
Reputation: 531165
The "body" of a generator expression does not capture the value n has when the expression is created, so n is just a free variable whose value is whatever is assigned to n at the time g is actually iterated. (The outermost iterable expression, by contrast, is evaluated immediately, so the g in "for i in g" is not a free variable but the iterable assigned to g at that moment.)
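Here is a small sketch of that late lookup, reusing add from the question; the names late and late_result are mine, and range(4) stands in for the inner generator to keep it short.

```python
def add(x, y):
    return x + y

n = 1
late = (add(n, i) for i in range(4))  # n is not read yet
n = 10                                # rebind n before iterating

# n is looked up only now, while the generator runs, so it sees 10.
late_result = list(late)
print(late_result)  # [10, 11, 12, 13]
```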
That is, after the for loop, you have
assert n == 10 # The last value assigned to n
# Pseudocode - every time n is used, it resolves to the *current*
# value of n, not the value n had when the generator expression was
# defined.
g = (add(10, i) for i in (add(10, i) for i in (x for x in range(4))))
# *not* (add(10, i) for i in (add(1, i) for i in (x for x in range(4))))
  = (add(10, i) for i in (add(10, i) for i in (0, 1, 2, 3)))
  = (add(10, i) for i in (10, 11, 12, 13))
  = (10 + i for i in (10, 11, 12, 13))
And so
list(g) == [20, 21, 22, 23]
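If you wanted each generator to use the value n had on its own loop iteration, one possible approach (a sketch, not the only way) is to build the generator inside a function so that n becomes a parameter. The helper name shifted is hypothetical.

```python
def add(x, y):
    return x + y

def shifted(n, source):
    # Hypothetical helper: n is a parameter here, so every call gets its
    # own binding that later loop iterations cannot change.
    return (add(n, i) for i in source)

g = (x for x in range(4))
for n in [1, 10]:
    g = shifted(n, g)

eager_result = list(g)
print(eager_result)  # [11, 12, 13, 14]
```

With this version the first layer adds 1 and the second adds 10, which is what the code in the question looks like it should do at first glance.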
Upvotes: 3