Reputation: 653
I am trying to group the following details list:
>>> import itertools, operator
>>> details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]
>>> for k, v in itertools.groupby(details, key=operator.itemgetter(0)):
...     print(k, list(v))
And this is the output with the above groupby statement:
20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]
20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]
But my expected output was:
20130325 [('20130325', 'B'),('20130325', 'B')]
20130320 [('20130320', 'A'),('20130320', 'A')]
Am I doing something wrong?
Upvotes: 3
Views: 572
Reputation: 57251
The toolz project offers a non-streaming groupby that returns a dict and does not need sorted input:
$ pip install toolz
$ ipython
In [1]: from toolz import groupby, first
In [2]: details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]
In [3]: groupby(first, details)
Out[3]:
{'20130320': [('20130320', 'A'), ('20130320', 'A')],
'20130325': [('20130325', 'B'), ('20130325', 'B')]}
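If you would rather not add a dependency, the same non-streaming behaviour can be approximated with the standard library. This is only a rough sketch, not toolz's actual implementation; `group_by_key` is a hypothetical helper name made up for this example.

```python
from collections import defaultdict


def group_by_key(key, seq):
    """Collect items into a dict of lists, keyed by key(item).

    Hypothetical helper: unlike itertools.groupby, it does not
    require equal keys to be adjacent in the input.
    """
    groups = defaultdict(list)
    for item in seq:
        groups[key(item)].append(item)
    return dict(groups)


details = [('20130325', 'B'), ('20130320', 'A'), ('20130325', 'B'), ('20130320', 'A')]

# Group on the first element of each tuple.
grouped = group_by_key(lambda pair: pair[0], details)

for k, v in grouped.items():
    print(k, v)
```

On Python 3.7+ dicts keep insertion order, so the keys come out in the order they were first seen in `details`.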
Upvotes: 1
Reputation: 62868
You have to sort your details first:
details.sort(key=operator.itemgetter(0))
or
fst = operator.itemgetter(0)
itertools.groupby(sorted(details, key=fst), key=fst)
groupby() only groups consecutive records that share a key. From the itertools documentation:
The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
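To make the difference concrete, here is a small sketch that runs groupby on the question's list both unsorted and sorted by the same key. The names `unsorted_groups` and `sorted_groups` are just illustrative.

```python
import itertools
import operator

details = [('20130325', 'B'), ('20130320', 'A'), ('20130325', 'B'), ('20130320', 'A')]
fst = operator.itemgetter(0)

# Unsorted input: equal keys are never adjacent here, so every
# record ends up in a group of its own.
unsorted_groups = [(k, list(v)) for k, v in itertools.groupby(details, key=fst)]

# Sorted by the same key: equal keys become adjacent and are merged.
sorted_groups = [
    (k, list(v))
    for k, v in itertools.groupby(sorted(details, key=fst), key=fst)
]

for k, v in sorted_groups:
    print(k, v)
```

Note that sorting changes the order of the groups: '20130320' now comes before '20130325', which is not the order shown in the question's expected output.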
Upvotes: 7