Konrad

Reputation: 2287

Converting iterator to list changes the iterator

The following code produces some very strange results:

import re
string = "test test test"
positions = re.finditer("test", string)
print(list(positions))
print(list(positions))

Output:

[<Match object...>, <Match object...>, <Match object...>]
[]

Now, I think I know what's going on here. The first list call "exhausts" the iterator (it "uses the iterator up", as with a generator, in the process of building a list from it), so when the second call to list is made, the iterator has nothing left to yield and we get an empty list. This seems to be confirmed by the paragraph below, although I am still trying to understand some of what it says, so I am not entirely happy with this explanation (if it is the right one):

A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.

The above paragraph is from the official documentation.

I do not really understand what they are saying in the first sentence in the above paragraph, especially with regards to passing to the iter() function, and I do not know how they connect usage in a for loop to a list producing a fresh new iterator. The second sentence, though, seems closer to what I first thought was going on in the code above.

If anyone can help me clear up the confusion here, I would appreciate it immensely.

NOTE:

I am using Python 3.5.1

Upvotes: 2

Views: 2056

Answers (2)

Amit Gold

Reputation: 767

When list() is given an iterator, it essentially calls next() on it repeatedly, appending each returned value to a new list, until next() raises StopIteration; it then returns that list. A simplified implementation of list(iterator) could look like this:

def my_list(iterator):
    output = []
    try:
        while True:
            # Python 3 spelling; iterator.next() only existed in Python 2
            output.append(next(iterator))
    except StopIteration:
        # the iterator is exhausted at this point
        return output

By the way, a for loop does essentially the same thing, except that it first calls iter() on the object it is given, and instead of output.append(next(iterator)) it runs the body of the loop.
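
To make the for loop comparison concrete, here is a rough sketch of what a for loop does behind the scenes. The helper name manual_for and the variable names are made up for illustration; this is not how CPython literally implements the loop, only an equivalent in plain Python.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    it = iter(iterable)  # a list returns a NEW iterator; an iterator returns itself
    while True:
        try:
            item = next(it)
        except StopIteration:
            break  # the loop ends quietly when the iterator is exhausted
        body(item)


letters = ['a', 'b', 'c']

# Looping over the list twice works, because iter(letters) is fresh each time.
seen = []
manual_for(letters, seen.append)
manual_for(letters, seen.append)
print(seen)  # ['a', 'b', 'c', 'a', 'b', 'c']

# Looping over the same iterator twice does not: the second pass finds it empty.
one_shot = iter(letters)
first_pass, second_pass = [], []
manual_for(one_shot, first_pass.append)
manual_for(one_shot, second_pass.append)
print(first_pass)   # ['a', 'b', 'c']
print(second_pass)  # []
```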

Upvotes: 0

wim

Reputation: 362478

This line:

positions = re.finditer("test", string)

returns a one-shot iterator. You then called list(positions) twice on that same iterator.

So for the second call, it was already exhausted.
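
If you need the matches more than once, one possible approach (a minimal sketch, with matches and starts as illustrative names) is to materialise the iterator into a list a single time and reuse that list, or to call re.finditer again to get a new iterator:

```python
import re

string = "test test test"

# Consume the one-shot iterator exactly once and keep the result.
matches = list(re.finditer("test", string))

# The list can now be traversed as often as needed.
starts = [m.start() for m in matches]
print(starts)        # [0, 5, 10]
print(len(matches))  # 3

# Alternatively, ask for a fresh iterator each time it is needed.
again = [m.start() for m in re.finditer("test", string)]
print(again)         # [0, 5, 10]
```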

Lists will give you a fresh iterator every time you iterate them, so there is no exhausting behaviour on a list itself. Compare the behaviour below to understand the piece of documentation you quoted:

>>> L = ['a', 'b', 'c']
>>> list(L)
['a', 'b', 'c']
>>> list(L)
['a', 'b', 'c']
>>> iter_L = iter(L)  # calls L.__iter__() and returns you a one-shot iterator
>>> list(iter_L)
['a', 'b', 'c']
>>> list(iter_L)
[]
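
The part of the quoted documentation about passing an object to iter() can be checked directly. The short sketch below (variable names are illustrative) shows that calling iter() on a list hands back a different iterator every time, while calling iter() on an iterator hands back the very same object, including whatever state it is already in:

```python
letters = ['a', 'b', 'c']

# A container produces a new, independent iterator on every iter() call.
first_iter = iter(letters)
second_iter = iter(letters)
distinct = first_iter is not second_iter
print(distinct)  # True

# An iterator returns itself from iter(), so no "reset" happens.
same = iter(first_iter) is first_iter
print(same)  # True

# Advancing first_iter affects later passes over it, but not second_iter.
next(first_iter)  # consumes 'a'
rest = list(iter(first_iter))
untouched = list(second_iter)
print(rest)       # ['b', 'c']
print(untouched)  # ['a', 'b', 'c']
```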

Upvotes: 1
