Reputation: 1481
I came across this code in "Most pythonic way of counting matching elements in something iterable":
r = xrange(1, 10)
print sum(1 for v in r if v % 2 == 0) # 4
print sum(1 for v in r if v % 3 == 0) # 3
r is iterated once, and then it is iterated again. I thought that once an iterator has been consumed it is exhausted and cannot be iterated again.
Generator expressions can be iterated only once:
r = (7 * i for i in xrange(1, 10))
print sum(1 for v in r if v % 2 == 0) # 4
print sum(1 for v in r if v % 3 == 0) # 0
enumerate(L) too:
r = enumerate(mylist)
and file object too:
f = open(myfilename, 'r')
Why does xrange behave differently?
Upvotes: 29
Views: 1565
Reputation: 71590
If all you know about something is that it's an iterator, then in general you must assume you can only iterate over it once. That doesn't mean that every iterator can only be consumed once, just that every iterator can be consumed at least once. The obvious example is lists and other sequences: they support this interface and can be traversed as many times as you like.
As senderle and Amber have explained, the particular iterators you get by calling xrange happen to be implemented such that you can iterate over them multiple times.
The general iterator idea allows iterators to be exhausted after a single traversal. This is because many iterators (such as generators, file traversal, etc.) would be difficult to implement, or would consume much more memory or run much slower, if they had to support arbitrarily many traversals, and very often this functionality wouldn't even be used. So if iterators had to support arbitrarily many traversals, then these things probably wouldn't be iterators.
Long story short, if you're writing code that operates on an arbitrary unknown iterator, you assume it can only be traversed once, and it doesn't matter if someone gives you an object that supports more than the functionality you need. If you know some additional information about the iterator (such as that it's also a sequence, or even as much as that it's an xrange object), then you can code to make use of that if you want.
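A minimal sketch of that advice, written for Python 3 (where range plays the role of xrange). The function names count_even_and_triple and is_one_shot are hypothetical, made up for illustration. The first does both counts from the question in a single pass, so it does not care whether its argument can be traversed again; the second relies on the convention that an iterator returns itself from iter(), which is a heuristic rather than a guarantee.

```python
def count_even_and_triple(iterable):
    """Count multiples of 2 and of 3 in one traversal.

    Because the input is walked only once, this works for one-shot
    iterators (generators, files) as well as for sequences.
    """
    evens = threes = 0
    for v in iterable:
        if v % 2 == 0:
            evens += 1
        if v % 3 == 0:
            threes += 1
    return evens, threes


def is_one_shot(obj):
    """Heuristic: an iterator returns itself from iter(), a container does not."""
    return iter(obj) is obj


print(count_even_and_triple(range(1, 10)))                 # (4, 3)
print(count_even_and_triple(7 * i for i in range(1, 10)))  # (4, 3)
print(is_one_shot(range(1, 10)))                           # False
print(is_one_shot(i for i in range(1, 10)))                # True
```

If several passes really are needed and the input might be one-shot, copying it into a list first trades memory for the ability to traverse it again.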
Upvotes: 2
Reputation: 151147
Because xrange does not return a generator. It returns an xrange object.
>>> type(xrange(10))
<type 'xrange'>
In addition to repeated iteration, xrange objects support other things that generators don't -- like indexing:
>>> xrange(10)[5]
5
They also have a length:
>>> len(xrange(10))
10
And they can be reversed:
>>> list(reversed(xrange(10)))
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In short, xrange objects implement the full sequence interface:
>>> import collections
>>> isinstance(xrange(10), collections.Sequence)
True
They just do it without using up a lot of memory.
Note also that in Python 3, the range object returned by range has all the same properties.
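For readers on Python 3, here is a sketch of the question's experiment with range in place of xrange. The variable names are only illustrative. Note that the Sequence ABC is imported from collections.abc in Python 3.

```python
from collections.abc import Sequence

r = range(1, 10)
evens = sum(1 for v in r if v % 2 == 0)
threes = sum(1 for v in r if v % 3 == 0)  # second pass over the same object
print(evens, threes)                      # 4 3

g = (7 * i for i in range(1, 10))
g_evens = sum(1 for v in g if v % 2 == 0)
g_threes = sum(1 for v in g if v % 3 == 0)  # g is already exhausted here
print(g_evens, g_threes)                    # 4 0

print(isinstance(r, Sequence))  # True
print(r[5], len(r))             # 6 9
```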
Upvotes: 38
Reputation: 527328
Because the xrange object produced by calling xrange() specifies an __iter__ that provides a unique version of itself (actually, a separate rangeiterator object) each time it's iterated.
>>> x = xrange(3)
>>> type(x)
<type 'xrange'>
>>> i = x.__iter__()
>>> type(i)
<type 'rangeiterator'>
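The same pattern can be sketched with a user-defined class. Countdown is a hypothetical example, not anything from the standard library: its __iter__ is a generator function, so every call hands back a fresh iterator, while each of those iterators is itself one-shot.

```python
class Countdown(object):
    """Re-iterable object: every call to __iter__ builds a new iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Generator function: each call returns an independent generator
        # that starts counting from self.start again.
        n = self.start
        while n > 0:
            yield n
            n -= 1


c = Countdown(3)
first_pass = list(c)
second_pass = list(c)           # works again: a new iterator was created
print(first_pass, second_pass)  # [3, 2, 1] [3, 2, 1]

it = iter(c)                    # one particular iterator
print(list(it))                 # [3, 2, 1]
print(list(it))                 # [] because this iterator is now exhausted
print(iter(it) is it)           # True: an iterator returns itself
print(iter(c) is iter(c))       # False: two separate iterators
```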
Upvotes: 17