David Eyk

Reputation: 12541

How do I merge two python iterators?

I have two iterables, a list and an itertools.count object (i.e. an infinite value generator). I would like to merge these two into a resulting iterator that yields values alternately from each:

>>> import itertools
>>> c = itertools.count(1)
>>> items = ['foo', 'bar']
>>> merged = imerge(items, c)  # the mythical "imerge"
>>> merged.next()
'foo'
>>> merged.next()
1
>>> merged.next()
'bar'
>>> merged.next()
2
>>> merged.next()
Traceback (most recent call last):
    ...
StopIteration

What is the simplest, most concise way to do this?

Upvotes: 31

Views: 34812

Answers (13)

Tom Swirly

Reputation: 2790

I also agree that itertools is not needed.

But why stop at 2?

def tmerge(*iterators):
  for values in zip(*iterators):
    for value in values:
      yield value

This handles any number of iterators, from zero upwards.

UPDATE: DOH! A commenter pointed out that this won't work unless all the iterators are the same length.

The correct code is:

import itertools

def tmerge(*iterators):
  empty = {}
  for values in itertools.zip_longest(*iterators, fillvalue=empty):
    for value in values:
      if value is not empty:
        yield value

And yes, I just tried it with lists of unequal length, and with a list containing {}.
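
For anyone who wants to repeat that check, here is a minimal sketch in Python 3. The name tmerge_sentinel and the sample inputs are made up for illustration, and it uses object() as the sentinel instead of {} (either works, since the comparison is by identity). Note that zip_longest() runs until the longest input ends, so this variant never stops if any input is infinite.

```python
import itertools

def tmerge_sentinel(*iterables):
    # Hypothetical name. A fresh object() is never identical to a caller's value.
    missing = object()
    for values in itertools.zip_longest(*iterables, fillvalue=missing):
        for value in values:
            if value is not missing:
                yield value

# Unequal lengths, and one input that contains {} as a real value.
result = list(tmerge_sentinel(['foo', 'bar'], [1, 2, 3, 4], [{}]))
print(result)  # ['foo', 1, {}, 'bar', 2, 3, 4]
```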

Upvotes: 15

user26294

Reputation: 5652

Using itertools.izip() instead of zip(), as some of the other answers do, will improve performance.

As "pydoc itertools.izip" shows:

Works like the zip() function but consumes less memory by returning an iterator instead of a list.

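itertools.izip() will also work properly even if one of the iterators is infinite. (In Python 3 the built-in zip() already returns an iterator, and itertools.izip() no longer exists.)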
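
A small sketch of the infinite-iterator point, assuming Python 3, where the built-in zip() is the lazy equivalent of izip(). The variable names are only for illustration.

```python
import itertools

items = ['foo', 'bar']
counter = itertools.count(1)

# zip() pulls one value from each input per step and stops as soon as
# the first exhausted input is hit, so the infinite counter is harmless.
pairs = list(zip(items, counter))
print(pairs)  # [('foo', 1), ('bar', 2)]
```

Putting the finite input first means the counter is not advanced on the final, failed step.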

Upvotes: 0

user76284

Reputation: 1328

Here is an elegant solution:

def alternate(*iterators):
    while len(iterators) > 0:
        try:
            yield next(iterators[0])
            # Move this iterator to the back of the queue
            iterators = iterators[1:] + iterators[:1]
        except StopIteration:
            # Remove this iterator from the queue completely
            iterators = iterators[1:]

Using an actual queue for better performance (as suggested by David):

from collections import deque

def alternate(*iterators):
    queue = deque(iterators)
    while len(queue) > 0:
        iterator = queue.popleft()
        try:
            yield next(iterator)
            queue.append(iterator)
        except StopIteration:
            pass

It works even when some iterators are finite and others are infinite:

from itertools import count

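# input() shows each value and waits for Enter, so the endless
# output can be stepped through one value at a time.
for n in alternate(count(), iter(range(3)), count(100)):
    input(n)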

Prints:

0
0
100
1
1
101
2
2
102
3
103
4
104
5
105
6
106

It also correctly stops if/when all iterators have been exhausted.

If you want to handle non-iterator iterables, like lists, you can use

def alternate(*iterables):
    queue = deque(map(iter, iterables))
    ...

Upvotes: 3

ᅠᅠᅠ
ᅠᅠᅠ

Reputation: 67010

One of the lesser-known features of Python is that a generator expression can have more than one for clause. This is very useful for flattening nested sequences, like the tuples you get from zip()/izip().

def imerge(*iterators):
    return (value for row in itertools.izip(*iterators) for value in row)
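
For Python 3 readers, the same idea can be sketched with the built-in zip(), which is lazy there. The for clauses read left to right, exactly like nested loops.

```python
import itertools

def imerge(*iterables):
    # Equivalent to:
    #   for row in zip(*iterables):
    #       for value in row:
    #           yield value
    return (value for row in zip(*iterables) for value in row)

merged = imerge(['foo', 'bar'], itertools.count(1))
result = list(merged)
print(result)  # ['foo', 1, 'bar', 2]
```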

Upvotes: 4

vampolo

Reputation: 107

I prefer this other way which is much more concise:

merged = reduce(lambda x, y: itertools.chain(x, y), iters)  # avoid shadowing the built-in iter()
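
One thing worth checking before using this: chain() runs through its inputs one after another, so the result is a concatenation rather than an alternation. A quick sketch in Python 3 (where reduce lives in functools; the sample iters list is made up):

```python
import itertools
from functools import reduce

iters = [['foo', 'bar'], [1, 2]]  # illustrative inputs

chained = reduce(lambda x, y: itertools.chain(x, y), iters)
result = list(chained)
print(result)  # ['foo', 'bar', 1, 2]
```

With an infinite first input, the later inputs would never be reached.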

Upvotes: 6

Thomas Moran

Reputation:

A concise method is to use a generator expression with itertools.cycle(). It avoids creating a long chain() of tuples.

generator = (it.next() for it in itertools.cycle([i1, i2]))
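
A caveat for Python 3.7 and later: under PEP 479, a StopIteration that escapes from inside a generator expression is converted into a RuntimeError, so the one-liner no longer ends cleanly when an input runs out. A possible workaround is to spell the same cycle() idea as a generator function. This is only a sketch, and the name cycle_merge is made up.

```python
import itertools

def cycle_merge(*iterables):
    # Hypothetical name; same cycle() idea, with the exhaustion case handled.
    iterators = [iter(it) for it in iterables]
    for it in itertools.cycle(iterators):
        try:
            yield next(it)
        except StopIteration:
            return  # stop as soon as any input is exhausted

result = list(cycle_merge(['foo', 'bar'], itertools.count(1)))
print(result)  # ['foo', 1, 'bar', 2]
```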

Upvotes: 1

David Locke

Reputation: 18084

You can do something that is almost exactly what @Pramod first suggested.

def izipmerge(a, b):
  for i, j in itertools.izip(a,b):
    yield i
    yield j

The advantage of this approach is that you won't run out of memory if both a and b are infinite.

Upvotes: 16

A. Coady

Reputation: 57378

Use izip and chain together:

>>> list(itertools.chain.from_iterable(itertools.izip(items, c))) # requires Python 2.6+
['foo', 1, 'bar', 2]

>>> list(itertools.chain(*itertools.izip(items, c)))
['foo', 1, 'bar', 2]
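
The two spellings differ in one respect that matters for infinite inputs: chain(*...) has to unpack the whole zipped sequence before chain() is even called, while chain.from_iterable() pulls one pair at a time. A sketch in Python 3 (using the built-in zip(), with made-up inputs):

```python
import itertools

evens = itertools.count(0, 2)
odds = itertools.count(1, 2)

# from_iterable() consumes the zip lazily, so two infinite inputs are fine.
merged = itertools.chain.from_iterable(zip(evens, odds))
first_six = list(itertools.islice(merged, 6))
print(first_six)  # [0, 1, 2, 3, 4, 5]
```

The chain(*zip(evens, odds)) form would never return here, because the unpacking tries to exhaust the zip first.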

Upvotes: 1

John Fouhy

Reputation: 42193

I'm not sure what your application is, but you might find the enumerate() function more useful.

>>> items = ['foo', 'bar', 'baz']
>>> for i, item in enumerate(items):
...  print item
...  print i
... 
foo
0
bar
1
baz
2
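
If the goal is the flat sequence from the question, one way to get there from enumerate() is sketched below (Python 3). It assumes the counter should start at 1, as with itertools.count(1) in the question, and uses the start argument for that.

```python
items = ['foo', 'bar', 'baz']

# Emit each item followed by its 1-based position.
merged = [value
          for i, item in enumerate(items, start=1)
          for value in (item, i)]
print(merged)  # ['foo', 1, 'bar', 2, 'baz', 3]
```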

Upvotes: 3

Andrea Ambu

Reputation: 39556

Why is itertools needed?

def imerge(a,b):
    for i,j in zip(a,b):
        yield i
        yield j

In this case at least one of a or b must be of finite length, because zip() (in Python 2) returns a list, not an iterator. If you need the zipping itself to be lazy, you can go for Claudiu's solution.

Upvotes: 0

Pramod

Reputation: 9466

A generator will solve your problem nicely.

def imerge(a, b):
    for i, j in itertools.izip(a,b):
        yield i
        yield j

Upvotes: 46

Claudiu

Reputation: 229491

I'd do something like this. This will be most time and space efficient, since you won't have the overhead of zipping objects together. This will also work if both a and b are infinite.

def imerge(a, b):
    i1 = iter(a)
    i2 = iter(b)
    while True:
        try:
            yield i1.next()
            yield i2.next()
        except StopIteration:
            return
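
For Python 3, where iterators have no .next() method, the same function can be sketched with the built-in next(). One behavioural detail to be aware of: because a is advanced before b, a value from a is still yielded on the step where b turns out to be exhausted, which a zip-based version would not do.

```python
import itertools

def imerge(a, b):
    i1 = iter(a)
    i2 = iter(b)
    while True:
        try:
            yield next(i1)
            yield next(i2)
        except StopIteration:
            return  # either input ran out

result = list(imerge(['foo', 'bar'], itertools.count(1)))
print(result)  # ['foo', 1, 'bar', 2]

uneven = list(imerge([1, 2, 3], ['x']))
print(uneven)  # [1, 'x', 2]
```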

Upvotes: 12

Claudiu

Reputation: 229491

You can use zip as well as itertools.chain. This will only work if at least one of the inputs is finite, since zip builds the whole list of pairs up front:

merge = itertools.chain(*[iter(i) for i in zip(['foo', 'bar'], itertools.count(1))])

Upvotes: 11
