user3375672

Reputation: 3768

Python consume an iterator pair-wise

I am trying to understand Python's iterators in the context of the pysam module. Calling the fetch method on an AlignmentFile object returns an iterator, iter, over the records in the file. I can then use various attributes to access each record, for instance its name via query_name:

import pysam
iter = pysam.AlignmentFile(file, "rb", check_sq=False).fetch(until_eof=True)
for record in iter:
  print(record.query_name)

It happens that the records come in pairs, so one would like something like:

while True:
  r1 = iter.__next__() 
  r2 = iter.__next__()
  print(r1.query_name)     
  print(r2.query_name)

Calling next() by hand is probably not the right way to handle millions of records, but how can one use a for loop to consume the same iterator two records at a time? I looked at the grouper recipe from itertools and at the SO questions "Iterate an iterator by chunks (of n) in Python?" (marked as a duplicate) and "What is the most “pythonic” way to iterate over a list in chunks?", but cannot get it to work.

Upvotes: 2

Views: 267

Answers (1)

timgeb

Reputation: 78790

First of all, don't use the variable name iter, because that's already the name of a builtin function.

To answer your question, simply pass the same iterator twice to itertools.izip (Python 2) or zip (Python 3).

Your code may look as simple as

for next_1, next_2 in zip(iterator, iterator):
    # stuff
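
To see why this works: zip takes one item from each argument in turn, and because both arguments are the same iterator object, each tuple ends up holding two consecutive items. Here is a minimal Python 3 sketch, using a plain list iterator as a stand-in for the pysam records (the names records and pairs and the sample strings are made up for illustration):

```python
# Stand-in data; in the question this would be the iterator returned by fetch().
records = iter(["read1/1", "read1/2", "read2/1", "read2/2"])

# Both arguments are the SAME iterator object, so every call to next() made by
# zip advances the shared position: each tuple holds two consecutive items.
pairs = list(zip(records, records))

for r1, r2 in pairs:
    print(r1, r2)
```

Building the list is only there to make the result easy to inspect; for a large file you would loop over zip(records, records) directly so that nothing is held in memory.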

edit: whoops, my original answer was the correct one all along, don't mind the itertools recipe.

edit 2: Consider itertools.izip_longest (Python 2; it is called zip_longest in Python 3) if your iterator could yield an odd number of objects, since plain zip silently drops the unpaired last item:

>>> from itertools import izip_longest
>>> iterator = (x for x in (1,2,3))
>>> 
>>> for next_1, next_2 in izip_longest(iterator, iterator):
...     next_1, next_2
... 
(1, 2)
(3, None)
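
The session above is Python 2. A rough Python 3 equivalent, as a sketch, would use itertools.zip_longest; the optional fillvalue argument (None by default) is what pads the last, incomplete pair:

```python
from itertools import zip_longest

# Three items: the last one has no partner.
iterator = iter((1, 2, 3))

# Same iterator passed twice; the missing partner is filled with None.
pairs = list(zip_longest(iterator, iterator))

print(pairs)  # [(1, 2), (3, None)]
```

In the loop body you can then test whether the second item is None to detect an unpaired trailing record.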

Upvotes: 3
