fizzer
fizzer

Reputation: 13806

Can I mark an iterator as finished prematurely?

Is there an idiomatic way to finish an iterator early, so that any further next()s raise StopIteration? (I can think of ugly ways, like abusing itertools.takewhile or discarding values until the thing is exhausted).

Edit:

My algorithm takes n iterators of unknown variable length as input. It uses izip_longest() to read one item from each in n-tuples, until all are exhausted. Sometimes I find that I want to stop input early from one of the iterators according to some runtime criterion, and replace it with the stream of default values provided by izip_longest(). The least invasive way I can think of is to 'finish' it somehow.

Upvotes: 3

Views: 420

Answers (5)

fizzer
fizzer

Reputation: 13806

In the end, I opted for abusing itertools.takewhile() after all. It's a bit more succinct than the other answers which use flags to 'consume' the iterator in constant time:

from itertools import takewhile, izip_longest

def f(seqs):
    done = [False] * len(seqs)
    iters = [ takewhile(lambda _, i=i: not done[i], s) for i, s in enumerate(seqs) ]
    zipped = izip_longest(*iters)

    # for example:
    print next(zipped)
    done[1] = True
    print next(zipped)
    print next(zipped)

f((['a', 'b', 'c'], [1, 2, 3], ['foo', 'bar', 'baz']))

Output:

('a', 1, 'foo')
('b', None, 'bar')
('c', None, 'baz')

Upvotes: 0

jme
jme

Reputation: 20725

Here's another answer, which I've decided to post separately because it's of a different flavor than my other one. I think this might be preferable: keep the iterators in an ordered dict mapping each iterator to {True, False} (True if the iterator is active, False otherwise). First we want a function that takes such a dict and calls next on each iterator, returning the default value and updating an iterator's status if it is exhausted:

import itertools
import collections

def deactivating_zipper(iterators, default):
    while True:
        values = []
        for iterator, active in iterators.items():
            if active:
                try:
                    values.append(next(iterator))
                except StopIteration:
                    values.append(default)
                    iterators[iterator] = False
            else:
                values.append(default)

        if not any(iterators.values()):
            return
        else:  
            yield values

So now if we have three iterators:

it_a = iter(["a", "b", "c", "d", "e"])
it_b = iter([1,2,3,4,5,6,7,8])
it_c = iter(["foo", "bar", "baz", "quux"])

iterators = collections.OrderedDict((it, True) for it in (it_a, it_b, it_c))

We can just loop over them thusly:

for a,b,c in deactivating_zipper(iterators, "n/a"):

    # deactivate it_a
    if b == 3:
        iterators[it_a] = False

    print a,b,c

This gives output:

a 1 foo
b 2 bar
c 3 baz
n/a 4 quux
n/a 5 n/a
n/a 6 n/a
n/a 7 n/a
n/a 8 n/a

Upvotes: 1

Joran Beasley
Joran Beasley

Reputation: 113998

class MyIter:
    def __init__(self,what):
       self.what = what
       self.done = False
       self.iter = iter(what)
    def __iter__(self):
       self.done = False
       self.iter = iter(self.what)
    def next(self):
       if self.done: raise StopIteration
       return next(self.iter)

x = MyIter(range(100))
print next(x)
x.done=True
next(x)

but this sounds like a bad idea in general

what you should really do is

for my_iterator in all_iterators:
    for element in my_iterator: #iterate over it 
       if check(element): #if whatever condition is true
           break #then we are done with this iterator on to the next

for the example listed in the comments by @jme use something like this

for i,my_iterator in enumerate(all_iterators):
    for j,element in enumerate(my_iterator): #iterate over it 
       if j > i: #if whatever condition is true
           break #then we are done with this iterator on to the next
       else:
           do_something(element)

Upvotes: 1

jme
jme

Reputation: 20725

In your edit, you give your use-case: you want something that behaves like izip_longest, but allows you to "disable" iterators prematurely. Here's an iterator class that allows this, as well as "enabling" a previously-disabled iterator.

class TerminableZipper(object):

    def __init__(self, iterators, fill="n/a"):
        self.iterators = collections.OrderedDict((it, True) 
                                                 for it in iterators)
        self.fill = fill
        self.zipper = itertools.izip_longest(*iterators, fillvalue=fill)

    def disable(self, iterator):
        self.iterators[iterator] = False
        self._make_iterators()

    def enable(self, iterator):
        self.iterators[iterator] = True
        self._make_iterators()

    def _make_iterators(self):
        def effective(it):
            iterator, active = it
            return iterator if active else iter([])

        effective_iterators = map(effective, self.iterators.items())                
        self.zipper = itertools.izip_longest(*effective_iterators, 
                                             fillvalue=self.fill)

    def __iter__(self):
        return self

    def next(self):
        return next(self.zipper)

An example:

>>> it_a = itertools.repeat(0)
>>> it_b = iter(["a", "b", "c", "d", "e", "f"])
>>> it_c = iter(["q", "r", "x"])
>>> zipper = TerminableZipper([it_a, it_b, it_c])
>>> next(zipper)
(0, 'a', 'q')
>>> next(zipper)
(0, 'b', 'r')
>>> zipper.disable(it_a)
>>> next(zipper)
('n/a', 'c', 'x')
>>> zipper.enable(it_a)
>>> next(zipper)
(0, 'd', 'n/a')

Upvotes: 1

Ignacio Vazquez-Abrams
Ignacio Vazquez-Abrams

Reputation: 798814

From itertools Recipes:

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is none, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

Upvotes: 3

Related Questions