vrume21

Reputation: 551

Using zip() on two lists compared with two iterable objects

As a disclaimer, I am relatively new to programming, so please excuse any simple oversights I've made in writing this question.

I am using Python 2.7.3 and have noticed something that seems unusual to me. I haven't been able to find a satisfactory explanation through Google or in the Python documentation. I can create a list and use zip() to create a list of tuples like so:

numList = range(4)
print zip(numList, numList)

[(0, 0), (1, 1), (2, 2), (3, 3)]

But when I use the iter() function on numList to create an iterator and use zip() in a similar manner on this object, I get a very different result:

numList = range(4)
numList = iter(numList)
print zip(numList, numList)

[(0, 1), (2, 3)]

I would appreciate it if someone could explain the difference between the two procedures and what is going on behind the scenes that causes this to happen.

Upvotes: 1

Views: 382

Answers (2)

Levon

Reputation: 143022

In the first case you provide zip with "two" lists (really the same list object, passed as two references). zip pulls a value from each argument in turn and "zip"s them together, so each value appears in both positions of its tuple. Reading a value from a list does not consume it.

In the second case, you create an iterator that would generate the values from 0 to 3 "on the fly"/on demand.

You now provide this iterator twice to zip (again, two references to the same iterator). zip "pulls out" a value from each argument in turn, i.e., it calls the iterator's next() method, and every call advances the one shared iterator.

The "first" iterator provides the first value 0, the "second" iterator (really just a reference to the same one) provides the next value 1, and so on in alternating fashion until the iterator is exhausted at 3.
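The alternation described above can be traced by hand. The following is only a rough sketch of what zip does with a shared iterator, not its actual implementation; the names nums, it and pairs are made up for illustration, and the code is written so that it runs on both Python 2.7 and Python 3:

```python
nums = list(range(4))
it = iter(nums)  # a single iterator; both "arguments" below are this object

pairs = []
while True:
    try:
        first = next(it)   # value requested for the first argument
        second = next(it)  # value requested for the second argument (same iterator)
    except StopIteration:
        break              # the shared iterator is exhausted
    pairs.append((first, second))

print(pairs)  # [(0, 1), (2, 3)]
```

Each call to next() advances the same iterator, which is why consecutive values end up paired together.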

So the main thing to keep in mind is that what you have provided to zip in both cases are two identical references to the same object. In the first case, the list already contains all the data to be zipped, and accessing the data doesn't make it unavailable. In the second case, the values are handed out on demand, so once a value has been produced it is consumed and is no longer available afterwards.
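A quick way to check the "consumed versus not consumed" point is to look at both objects after zipping. This is a small sketch with illustrative variable names (zipped_list, zipped_iter, leftover); list() is wrapped around zip so the same code behaves identically on Python 2.7 and Python 3:

```python
nums = list(range(4))

# Zipping the list with itself leaves the list untouched.
zipped_list = list(zip(nums, nums))
print(zipped_list)  # [(0, 0), (1, 1), (2, 2), (3, 3)]
print(nums)         # [0, 1, 2, 3]

# Zipping an iterator with itself uses it up.
it = iter(nums)
zipped_iter = list(zip(it, it))
leftover = list(it)  # nothing remains to be read from the iterator
print(zipped_iter)  # [(0, 1), (2, 3)]
print(leftover)     # []
```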

I hope this makes some sense (it would be easier to sketch out on a piece of paper :)

Upvotes: 6

Kevin Reid

Reputation: 43783

When you iterate over an iterator, elements are consumed from it, so each number occurs only once.

When you iterate over the same list twice, two separate list-iterators are made from it, and each produces the numbers once, so overall each number occurs twice.

Your first program is equivalent to doing

print zip(iter(numList), iter(numList))
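To compare the cases side by side, here is a short sketch (the variable names are illustrative, and list() is added around zip so it also runs on Python 3). It suggests that two separate iter() calls on a list give independent iterators, while reusing one iterator does not:

```python
nums = list(range(4))

same_list = list(zip(nums, nums))
separate_iters = list(zip(iter(nums), iter(nums)))
print(same_list == separate_iters)  # True

# Each iter() call on a list returns a new, independent iterator.
a = iter(nums)
b = iter(nums)
print(a is b)  # False

# Passing one iterator twice shares its position between both arguments.
shared = iter(nums)
shared_iter = list(zip(shared, shared))
print(shared_iter)  # [(0, 1), (2, 3)]
```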

Upvotes: 3
