Reputation: 618
My understanding of yield from is that it is similar to yielding every item from an iterable. Yet, I observe different behavior in the following example.
I have Class1:

class Class1:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        for el in self.gen:
            yield el
and Class2, which differs only in replacing the yield in the for loop with yield from:

class Class2:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        yield from self.gen
The code below reads the first element from an instance of a given class and then reads the rest in a for loop:

a = Class1((i for i in range(3)))
print(next(iter(a)))
for el in iter(a):
    print(el)
This produces different outputs for Class1 and Class2. For Class1 the output is

0
1
2

and for Class2 the output is

0

What is the mechanism behind yield from that produces the different behavior?
Upvotes: 32
Views: 4356
Reputation: 110301
updated
I don't see it as that complicated, and the resulting behavior can be seen as actually unsurprising.

When the iterator goes out of scope, Python throws a GeneratorExit exception in the (innermost) generator.

In the "classic" for form, the exception happens in the user-written __iter__ method, is not caught, and is suppressed by the generator machinery as it bubbles up.

In the yield from form, the same exception is thrown in the inner self.gen, thus "killing" it, and then bubbles up to the user-written __iter__.
Writing another intermediate generator can make this easily visible:
def inner_gen(gen):
    try:
        for item in gen:
            yield item
    except GeneratorExit:
        print("Generator exit thrown in inner generator")

class Class1:
    def __init__(self, gen):
        self.gen = inner_gen(gen)

    def __iter__(self):
        try:
            for el in self.gen:
                yield el
        except GeneratorExit:
            print("Generator exit thrown in outer generator for 'classic' form")

class Class2(Class1):
    def __iter__(self):
        try:
            yield from self.gen
        except GeneratorExit:
            print("Generator exit thrown in outer generator for 'yield from' form")

first = lambda g: next(iter(g))
And now:
In [324]: c1 = Class1((i for i in range(2)))
In [325]: first(c1)
Generator exit thrown in outer generator for 'classic' form
Out[325]: 0
In [326]: first(c1)
Generator exit thrown in outer generator for 'classic' form
Out[326]: 1
In [327]: c2 = Class2((i for i in range(2)))
In [328]: first(c2)
Generator exit thrown in inner generator
Generator exit thrown in outer generator for 'yield from' form
Out[328]: 0
In [329]: first(c2)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[329], line 1
(...)
StopIteration:
update

I had a previous answer text speculating how the call to close would take place, skipping the intermediate generator - it is not that simple regarding close, though: Python will always call __del__, not close, which is only called by the user, or in certain circumstances that were hard to pin down. But it will always throw the GeneratorExit exception in a generator-function body (not in a class with explicit __next__ and throw methods, though - let's leave that for another question :-D)
Upvotes: 2
Reputation: 18796
When you use next(iter(instance_of_Class2)), the iterator returned by iter() goes out of scope and is deleted, which calls .close() on the inner generator; with Class1, deleting that iterator only closes the iterator itself, leaving self.gen open:
>>> g = (i for i in range(3))
>>> b = Class2(g)
>>> i = iter(b) # hold iterator open
>>> next(i)
0
>>> next(i)
1
>>> del(i) # closes g
>>> next(iter(b))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
This behavior is described in PEP 342 in two parts:

- the .close() method (well, new to Python 2.5)
- "Add support to ensure that close() is called when a generator iterator is garbage-collected."
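As a quick sketch of what PEP 342's close() does (my own example, not from the answer above): close() throws GeneratorExit into the suspended generator, and every subsequent next() then raises StopIteration:

```python
def numbers():
    yield from (0, 1, 2)

g = numbers()
assert next(g) == 0
g.close()      # throws GeneratorExit inside the suspended generator
try:
    next(g)
except StopIteration:
    print("a closed generator raises StopIteration")
```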
What happens is a little clearer (if perhaps surprising) when multiple generator delegations occur; only the generator currently being delegated to is closed when its wrapping iterator is deleted:
>>> g1 = (a for a in range(10))
>>> g2 = (a for a in range(10, 20))
>>> def test3():
... yield from g1
... yield from g2
...
>>> next(test3())
0
>>> next(test3())
10
>>> next(test3())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
What options are there to make Class2 behave more the way you expect?

Notably, other strategies, though they don't have the visually pleasing sugar of yield from or some of its potential benefits, give you a way to interact with the values, which seems like a primary benefit:
>>> class Class3:
...     def __init__(self, gen):
...         self.iterator = iter(gen)
...
...     def __iter__(self):
...         return self.iterator
...
>>> c = Class3((i for i in range(3)))
>>> next(iter(c))
0
>>> next(iter(c))
1
(caveat: an iter() inconsistency - see comments below, i.e. why isn't e closed?)

itertools.chain.from_iterable is another option.
>>> import collections.abc
>>> class Class5(collections.abc.Generator):
...     def __init__(self, gen):
...         self.gen = gen
...
...     def send(self, value):
...         return next(self.gen)
...
...     def throw(self, value):
...         raise StopIteration
...
...     def close(self):  # optional, but more complete
...         self.gen.close()
...
>>> e = Class5((i for i in range(10)))
>>> next(e) # NOTE iter is not necessary!
0
>>> next(e)
1
>>> next(iter(e)) # but still works
2
>>> next(iter(e)) # doesn't close e?? (should it?)
3
>>> e.close()
>>> next(e)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/_collections_abc.py", line 330, in __next__
return self.send(None)
File "<stdin>", line 5, in send
StopIteration
A better clue is that if you directly try again, next(iter(instance)) raises StopIteration, indicating the generator is permanently closed (either through exhaustion or .close()), which is why iterating over it with a for loop yields no more values:
>>> a = Class1((i for i in range(3)))
>>> next(iter(a))
0
>>> next(iter(a))
1
>>> b = Class2((i for i in range(3)))
>>> next(iter(b))
0
>>> next(iter(b))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
However, if we name the iterator, it works as expected
>>> b = Class2((i for i in range(3)))
>>> i = iter(b)
>>> next(i)
0
>>> next(i)
1
>>> j = iter(b)
>>> next(j)
2
>>> next(i)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
To me, this suggests that when the iterator doesn't have a name, .close() is called on it when it goes out of scope:
>>> def gen_test(iterable):
... yield from iterable
...
>>> g = gen_test((i for i in range(3)))
>>> next(iter(g))
0
>>> g.close()
>>> next(iter(g))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Disassembling the result, we find the internals are a little different
>>> import dis
>>> a = Class1((i for i in range(3)))
>>> dis.dis(a.__iter__)
  6           0 LOAD_FAST                0 (self)
              2 LOAD_ATTR                0 (gen)
              4 GET_ITER
        >>    6 FOR_ITER                10 (to 18)
              8 STORE_FAST               1 (el)

  7          10 LOAD_FAST                1 (el)
             12 YIELD_VALUE
             14 POP_TOP
             16 JUMP_ABSOLUTE            6
        >>   18 LOAD_CONST               0 (None)
             20 RETURN_VALUE
>>> b = Class2((i for i in range(3)))
>>> dis.dis(b.__iter__)
  6           0 LOAD_FAST                0 (self)
              2 LOAD_ATTR                0 (gen)
              4 GET_YIELD_FROM_ITER
              6 LOAD_CONST               0 (None)
              8 YIELD_FROM
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE
Notably, the yield from version has GET_YIELD_FROM_ITER:

If TOS is a generator iterator or coroutine object it is left as is. Otherwise, implements TOS = iter(TOS).

(subtly, the YIELD_FROM opcode appears to be removed in 3.11)
So if the given iterable (to the class) is a generator iterator, it'll be handed off directly, giving the result we (might) expect.

Passing an iterable which isn't a generator (iter() creates a new iterator each time in both cases):
>>> a = Class1([i for i in range(3)])
>>> next(iter(a))
0
>>> next(iter(a))
0
>>> b = Class2([i for i in range(3)])
>>> next(iter(b))
0
>>> next(iter(b))
0
Expressly closing Class1's internal generator:
>>> g = (i for i in range(3))
>>> a = Class1(g)
>>> next(iter(a))
0
>>> next(iter(a))
1
>>> a.gen.close()
>>> next(iter(a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
The inner generator is only closed by a deleted iterator if that iterator had actually been started (a value popped from it):
>>> g = (i for i in range(10))
>>> b = Class2(g)
>>> i = iter(b)
>>> next(i)
0
>>> j = iter(b)
>>> del(j) # next() not called on j
>>> next(i)
1
>>> j = iter(b)
>>> next(j)
2
>>> del(j) # generator closed
>>> next(i) # now fails, despite range(10) above
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Upvotes: 27