Reputation: 2758
I would have expected these two pieces of code to produce the same results:
from itertools import groupby

for i in list(groupby('aaaabb')):
    print i[0], list(i[1])

for i, j in groupby('aaaabb'):
    print i, list(j)
In the first loop I convert the iterator returned by groupby to a list and iterate over that; in the second I iterate over the returned iterator directly.
The output of this script is
a []
b ['b']
a ['a', 'a', 'a', 'a']
b ['b', 'b']
Why is this the case?
Edit: for reference, the result of groupby('aabbaa') looks like:
('a', <itertools._grouper object at 0x10c1324d0>)
('b', <itertools._grouper object at 0x10c132250>)
Upvotes: 3
Views: 76
Reputation: 213807
This is a quirk of the groupby function, presumably for performance.
From the itertools.groupby documentation:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:

groups = []
uniquekeys = []
data = sorted(data, key=keyfunc)
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
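To make the sharing described in that passage visible, here is a minimal sketch. The names `source`, `grouped`, and `leftover` are only illustrative, and the snippet avoids print so it reads the same on Python 2 and Python 3. It keeps a handle on the underlying iterator and checks what is left after the groupby object has been advanced past the first group.

```python
from itertools import groupby

source = iter('aaaabb')        # keep a handle on the underlying iterator
grouped = groupby(source)

key, group = next(grouped)     # first group, with key 'a'
first_item = next(group)       # pulls one 'a' through the shared iterator

next(grouped)                  # advancing groupby skips the rest of the 'a' run
leftover = list(source)        # whatever groupby has not consumed yet
```

On CPython, `leftover` should end up holding only the final 'b': the remaining 'a' items were consumed when groupby moved on, which is why the first group has nothing left to yield afterwards.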
So, you can do this:
for i in [(x, list(y)) for x, y in groupby('aabbaa')]:
    print i[0], i[1]
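The same idea can be wrapped in a small helper. This is only a sketch written for Python 3 (note the print() call), and `materialize_groups` is a hypothetical name, not part of itertools. It copies each group into a list while groupby is still positioned on it, so the result can be iterated as often as needed.

```python
from itertools import groupby

def materialize_groups(iterable, key=None):
    """Return [(key, [items, ...]), ...], copying each group as it is produced."""
    return [(k, list(g)) for k, g in groupby(iterable, key)]

pairs = materialize_groups('aaaabb')
for k, items in pairs:
    print(k, items)
```

Because every group is consumed before groupby advances, the stored lists are complete, at the cost of holding all the grouped data in memory at once.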
Upvotes: 5