Victor Yan

Reputation: 3509

Why are exceptions within a Python generator not caught?

I have the following experimental code, which is meant to work like the built-in zip. The idea is simple: yield the zipped tuples one at a time until an IndexError occurs, at which point the generator stops.

def my_zip(*args):
    i = 0
    while True:
        try:
            yield (arg[i] for arg in args)
        except IndexError:
            raise StopIteration
        i += 1

However, when I executed the following code, the IndexError was not caught inside my_zip. It propagated out of the generator instead:

gen = my_zip([1,2], ['a','b'])
print(list(next(gen)))
print(list(next(gen)))
print(list(next(gen)))


IndexError                                Traceback (most recent call last)
I:\Software\WinPython-32bit-3.4.2.4\python-3.4.2\my\temp2.py in <module>()
     12 print(list(next(gen)))
     13 print(list(next(gen)))
---> 14 print(list(next(gen)))

I:\Software\WinPython-32bit-3.4.2.4\python-3.4.2\my\temp2.py in <genexpr>(.0)
      3     while True:
      4         try:
----> 5             yield (arg[i] for arg in args)
      6         except IndexError:
      7             raise StopIteration
IndexError: list index out of range

Why is this happening?

Edit:

Thanks @thefourtheye for providing a nice explanation for what's happening above. Now another problem occurs when I execute:

list(my_zip([1,2], ['a','b']))

This line never returns and seems to hang the machine. What's happening now?

Upvotes: 23

Views: 3285

Answers (4)

warvariuc

Reputation: 59594

def my_zip(*args):
    i = 0
    while True:
        try:
            yield (arg[i] for arg in args)
        except IndexError:
            raise StopIteration
        i += 1

The IndexError is not caught because (arg[i] for arg in args) is a generator expression: its body does not run when it is created, only when you start iterating over it. You iterate over it in another scope, when you call list((arg[i] for arg in args)):

# get the generator which yields another generator on each iteration
gen = my_zip([1,2], ['a','b'])
# get the second generator `(arg[i] for arg in args)` from the first one
# then iterate over it: list((arg[i] for arg in args))
print(list(next(gen)))
  • On the first list(next(gen)), i equals 0.
  • On the second list(next(gen)), i equals 1.
  • On the third list(next(gen)), i equals 2, and here you get the IndexError, raised in the outer scope. The line is effectively list(arg[2] for arg in ([1,2], ['a','b'])).
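The same effect can be reproduced in isolation. The sketch below is only an illustration: make_lazy is a hypothetical helper, not part of the question's code. It shows that creating the generator expression inside a try block raises nothing, and that the IndexError appears only where the expression is consumed.

```python
def make_lazy(seq, i):
    """Hypothetical helper: build a generator expression inside a try block."""
    try:
        # Creating the generator expression does not evaluate item[i],
        # so no IndexError can be raised inside this try block.
        return (item[i] for item in seq)
    except IndexError:
        return "caught inside"


lazy = make_lazy([[1, 2], ["a", "b"]], 2)  # no error here

try:
    list(lazy)  # item[2] is evaluated here, outside make_lazy
    outcome = "no error"
except IndexError:
    outcome = "raised at the consumer"

print(outcome)  # raised at the consumer
```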

Upvotes: 2

thefourtheye

Reputation: 239453

Each yield produces a new generator object, and creating those generator objects raises no error. That is why the try...except in my_zip never catches anything. The third time you executed it, the call was effectively:

list(arg[2] for arg in args)

This is a simplified view of what the third call reduces to. Note that list is iterating over the inner generator expression here, not over the my_zip generator itself. list calls next on that generator object, arg[2] is evaluated, and 2 turns out not to be a valid index for arg (which is [1, 2] in this case). An IndexError is raised, and since list has no reason to handle it, the exception propagates to the caller.


As per the edit,

list(my_zip([1,2], ['a','b']))

will be evaluated like this. First, my_zip is called, which gives you a generator object. list then iterates over it: it calls next and gets another generator object, (arg[0] for arg in args). Since no exception or return is encountered, list calls next again, gets another generator object, (arg[1] for arg in args), and keeps going. Remember that the yielded generators are never iterated, so the IndexError is never raised. That is why the code runs forever.

You can confirm this by taking only the first 10 items with islice:

from itertools import islice
from pprint import pprint
pprint(list(islice(my_zip([1, 2], ["a", 'b']), 10)))

and you will get

[<generator object <genexpr> at 0x7f4d0a709678>,
 <generator object <genexpr> at 0x7f4d0a7096c0>,
 <generator object <genexpr> at 0x7f4d0a7099d8>,
 <generator object <genexpr> at 0x7f4d0a709990>,
 <generator object <genexpr> at 0x7f4d0a7095a0>,
 <generator object <genexpr> at 0x7f4d0a709510>,
 <generator object <genexpr> at 0x7f4d0a7095e8>,
 <generator object <genexpr> at 0x7f4d0a71c708>,
 <generator object <genexpr> at 0x7f4d0a71c750>,
 <generator object <genexpr> at 0x7f4d0a71c798>]

So the code tries to build an infinite list of generator objects.
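One possible way to get the behaviour the question was after is to consume the generator expression inside the try block, for example with tuple(...), so that the IndexError is raised where it can be caught. The sketch below is one such variant, not the only fix; my_zip_eager is a name made up for this illustration. It also ends the generator with a plain return, because on Python 3.7 and later (PEP 479) a StopIteration raised inside a generator body is converted to RuntimeError.

```python
def my_zip_eager(*args):
    """Illustrative variant of my_zip that builds each row eagerly."""
    i = 0
    while True:
        try:
            # tuple(...) consumes the generator expression right here,
            # so an IndexError is raised inside this try block.
            row = tuple(arg[i] for arg in args)
        except IndexError:
            # A plain return ends the generator cleanly.
            return
        yield row
        i += 1


print(list(my_zip_eager([1, 2], ["a", "b"])))  # [(1, 'a'), (2, 'b')]
```

Note that this sketch assumes at least one argument and only works with indexable sequences. Called with no arguments it would yield empty tuples forever, unlike the built-in zip.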

Upvotes: 13

mhawke

Reputation: 87064

Sorry, I can't offer a coherent explanation of why the exception is not caught. However, there is an easy way around it: use a for loop over the length of the shortest sequence:

def my_zip(*args):
    for i in range(min(len(arg) for arg in args)):
        yield (arg[i] for arg in args)

>>> gen = my_zip([1,2], ["a",'b','c'])
>>> print(list(next(gen)))
[1, 'a']
>>> print(list(next(gen)))
[2, 'b']
>>> print(list(next(gen)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
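One caveat with yielding a lazy generator expression, as the version above does: the expression looks up i only when it is consumed. If the rows are collected first and consumed later, every row sees the final value of i. The sketch below illustrates this; my_zip_lazy and my_zip_tuples are names invented for the comparison, and wrapping the row in tuple(...) is just one way to avoid the problem.

```python
def my_zip_lazy(*args):
    for i in range(min(len(arg) for arg in args)):
        # Lazy: arg[i] is evaluated only when the row is consumed.
        yield (arg[i] for arg in args)


def my_zip_tuples(*args):
    for i in range(min(len(arg) for arg in args)):
        # Eager: the row is fully built before i changes.
        yield tuple(arg[i] for arg in args)


# Consuming each row immediately works as expected.
immediate = [list(row) for row in my_zip_lazy([1, 2], ["a", "b"])]

# Collecting the rows first and consuming them later does not.
rows = list(my_zip_lazy([1, 2], ["a", "b"]))
delayed = [list(row) for row in rows]

print(immediate)  # [[1, 'a'], [2, 'b']]
print(delayed)    # [[2, 'b'], [2, 'b']]
print(list(my_zip_tuples([1, 2], ["a", "b"])))  # [(1, 'a'), (2, 'b')]
```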

Upvotes: 1

ForceBru

Reputation: 44838

Try replacing yield (arg[i] for ...) with the following:

for arg in args:
    yield arg[i]

But if the arguments are numbers, that raises an exception, since 1[1] makes no sense. In that case I suggest replacing arg[i] with just arg.

Upvotes: 0
