Reputation: 13410
In this question, @lazyr asks how the following pure-Python code for the izip_longest
iterator, taken from the documentation, works:
def izip_longest_from_docs(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter() # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass
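The stopping condition hinges on the default argument of sentinel: the bound pop method of a list holding len(args)-1 fill values. Each input that runs out calls it once, and the call made by the last input finds the list empty and raises IndexError. A minimal sketch of that trick in isolation (the names n_iterables, counter and results are made up for illustration):

```python
# Illustrative only: mimic the counter that sentinel() uses for 3 input iterables.
n_iterables = 3
fillvalue = '-'
counter = ([fillvalue] * (n_iterables - 1)).pop  # bound pop of a 2-item list

results = []
for _ in range(n_iterables):
    try:
        results.append(counter())     # one call per exhausted iterable
    except IndexError:
        results.append('IndexError')  # the last iterable to run out hits the empty list

print(results)  # ['-', '-', 'IndexError']
```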
While trying to understand how it works, I stumbled on this question:
"What if IndexError is raised inside one of the iterators that are passed to izip_longest
as arguments?"
So I wrote some test code:
from itertools import izip_longest, repeat, chain, izip

def izip_longest_from_docs(*args, **kwds):
    # The code is exactly the same as shown above
    ....

def gen1():
    for i in range(5):
        yield i

def gen2():
    for i in range(10):
        if i==8:
            raise IndexError # simulates an IndexError raised inside the iterator
        yield i

for i in izip_longest_from_docs(gen1(),gen2(), fillvalue = '-'):
    print('{i[0]} {i[1]}'.format(**locals()))

print('\n')

for i in izip_longest(gen1(),gen2(), fillvalue = '-'):
    print('{i[0]} {i[1]}'.format(**locals()))
It turned out that the function in the itertools module and izip_longest_from_docs
behave differently.
The output of the code above:
>>>
0 0
1 1
2 2
3 3
4 4
- 5
- 6
- 7


0 0
1 1
2 2
3 3
4 4
- 5
- 6
- 7
Traceback (most recent call last):
  File "C:/..., line 31, in <module>
    for i in izip_longest(gen1(),gen2(), fillvalue = '-'):
  File "C:/... test_IndexError_inside iterator.py", line 23, in gen2
    raise IndexError
IndexError
So it is clear that izip_longest from itertools did propagate the IndexError
exception (as I think it should), but izip_longest_from_docs 'swallowed' the IndexError,
because it took it as the signal from sentinel to stop iterating.
My question is: how did they work around IndexError propagation in the code in the itertools
module?
Upvotes: 4
Views: 780
Reputation: 176910
In izip_longest_next, in the code of izip_longest, no sentinel is used.
Instead, CPython keeps track of how many of the iterators are still active with a counter, and stops when the number active reaches zero.
If an error occurs, it ends iteration as if there are no iterators still active, and allows the error to propagate.
The code:
item = PyIter_Next(it);
if (item == NULL) {
    lz->numactive -= 1;
    if (lz->numactive == 0 || PyErr_Occurred()) {
        lz->numactive = 0;
        Py_DECREF(result);
        return NULL;
    } else {
        Py_INCREF(lz->fillvalue);
        item = lz->fillvalue;
        PyTuple_SET_ITEM(lz->ittuple, i, NULL);
        Py_DECREF(it);
    }
}
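To make the C logic above easier to follow, here is a rough pure-Python sketch of the same counter-based idea. It is not CPython's code: the names zip_longest_counted and faulty are made up for illustration. Because only StopIteration is caught, any other exception raised by an input iterator (such as IndexError) simply propagates to the caller.

```python
def zip_longest_counted(*iterables, **kwds):
    """Hypothetical sketch: count active iterators instead of using a sentinel."""
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in iterables]
    num_active = len(iterators)
    if not num_active:
        return
    while True:
        values = []
        for i, it in enumerate(iterators):
            if it is None:               # this input ran out earlier
                values.append(fillvalue)
                continue
            try:
                value = next(it)
            except StopIteration:        # only normal exhaustion is handled here
                num_active -= 1
                if num_active == 0:      # every input is done: stop
                    return
                iterators[i] = None      # remember that this slot is exhausted
                value = fillvalue
            values.append(value)
        yield tuple(values)


def faulty():
    """Illustrative generator that raises IndexError part-way through."""
    for i in range(10):
        if i == 8:
            raise IndexError
        yield i


print(list(zip_longest_counted('ABCD', 'xy', fillvalue='-')))
# [('A', 'x'), ('B', 'y'), ('C', '-'), ('D', '-')]
```

Iterating over zip_longest_counted(range(5), faulty(), fillvalue='-') should yield eight tuples and then raise IndexError, matching the behaviour of the itertools version shown in the question.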
The simplest solution I see:
from itertools import chain, izip, repeat

def izip_longest_modified(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    class LongestExhausted(Exception):
        pass
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        try:
            yield counter() # yields the fillvalue, or raises IndexError
        except IndexError: # not a bare except, which would also catch GeneratorExit
            raise LongestExhausted
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except LongestExhausted:
        pass
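As a quick check of this idea, here is a sketch ported to Python 3, where izip is the built-in zip. The names zip_longest_modified and faulty are made up for illustration, and the sentinel catches only IndexError rather than using a bare except. With a private exception as the stop signal, an IndexError raised by one of the inputs is no longer swallowed.

```python
from itertools import chain, repeat


class LongestExhausted(Exception):
    """Private signal: every input iterable has run out."""


def zip_longest_modified(*args, **kwds):
    fillvalue = kwds.get('fillvalue')
    counter = ([fillvalue] * (len(args) - 1)).pop

    def sentinel():
        try:
            filler = counter()        # raises IndexError once the list is empty
        except IndexError:
            raise LongestExhausted    # translate into the private signal
        yield filler

    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except LongestExhausted:
        pass


def faulty():
    """Illustrative generator that raises IndexError part-way through."""
    for i in range(10):
        if i == 8:
            raise IndexError
        yield i


rows = []
try:
    for row in zip_longest_modified(range(5), faulty(), fillvalue='-'):
        rows.append(row)
except IndexError:
    print('IndexError propagated after', len(rows), 'rows')
```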
Upvotes: 3