maruthi reddy

Reputation: 653

Python Groupby statement

I am trying to group the following list, details, by the first element of each tuple:

details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]

>>> import itertools, operator
>>> for k, v in itertools.groupby(details, key=operator.itemgetter(0)):
...     print(k, list(v))

And this is the output with the above groupby statement:

20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]
20130325 [('20130325', 'B')]
20130320 [('20130320', 'A')]

But my expected output was:

20130325 [('20130325', 'B'), ('20130325', 'B')]
20130320 [('20130320', 'A'), ('20130320', 'A')]

Am I doing something wrong?

Upvotes: 3

Views: 572

Answers (2)

MRocklin

Reputation: 57251

The toolz project offers a non-streaming groupby, which returns a dict and does not need the input to be sorted:

$ pip install toolz
$ ipython

In [1]: from toolz import groupby, first

In [2]: details = [('20130325','B'), ('20130320','A'), ('20130325','B'), ('20130320','A')]

In [3]: groupby(first, details)
Out[3]: 
{'20130320': [('20130320', 'A'), ('20130320', 'A')],
 '20130325': [('20130325', 'B'), ('20130325', 'B')]}
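
If installing a third-party package is not an option, a similar dict-of-lists result can be built with the standard library alone. The following is only a minimal sketch: `group_by_key` is a hypothetical helper name chosen for illustration, and it is not claimed to match how toolz implements its groupby.

```python
from collections import defaultdict
from operator import itemgetter

details = [('20130325', 'B'), ('20130320', 'A'),
           ('20130325', 'B'), ('20130320', 'A')]

def group_by_key(items, key):
    """Hypothetical helper: collect items into lists keyed by key(item)."""
    groups = defaultdict(list)
    for item in items:
        # Items with the same key end up in the same list,
        # regardless of where they appear in the input.
        groups[key(item)].append(item)
    return dict(groups)

grouped = group_by_key(details, itemgetter(0))
print(grouped)
```

Because every item is stored in a dict before anything is returned, this approach needs no sorting, but it holds all groups in memory at once.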

Upvotes: 1

Pavel Anossov

Reputation: 62868

You have to sort your details first:

details.sort(key=operator.itemgetter(0))

or

fst = operator.itemgetter(0)
itertools.groupby(sorted(details, key=fst), key=fst)
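
Put together with the data from the question, the sorted variant might look like the sketch below (written for Python 3; the `result` list is added here only so the groups can be inspected afterwards):

```python
import itertools
import operator

details = [('20130325', 'B'), ('20130320', 'A'),
           ('20130325', 'B'), ('20130320', 'A')]

fst = operator.itemgetter(0)

result = []
# Sort by the same key that groupby uses, so equal keys become adjacent.
for k, v in itertools.groupby(sorted(details, key=fst), key=fst):
    group = list(v)  # materialize the group before advancing the iterator
    result.append((k, group))
    print(k, group)

# 20130320 [('20130320', 'A'), ('20130320', 'A')]
# 20130325 [('20130325', 'B'), ('20130325', 'B')]
```

Note that the groups come out in sorted key order, so 20130320 is printed before 20130325.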

 

groupby only groups consecutive records with matching keys, so equal keys that are not adjacent end up in separate groups.

Documentation:

The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
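
The uniq-like behavior is easy to see on a short string. In this small illustration (the sample string "AABBA" is made up for the purpose), each run of equal characters becomes its own group unless the data is sorted first:

```python
from itertools import groupby

data = "AABBA"

# Without sorting: a new group starts every time the key changes.
unsorted_runs = [(k, len(list(g))) for k, g in groupby(data)]

# With sorting: all equal keys are adjacent, so each key appears once.
sorted_runs = [(k, len(list(g))) for k, g in groupby(sorted(data))]

print(unsorted_runs)  # [('A', 2), ('B', 2), ('A', 1)]
print(sorted_runs)    # [('A', 3), ('B', 2)]
```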

Upvotes: 7
