Reputation: 2147

Pythonic (and efficient) way to do a group-by on a list of dicts?

I am struggling to find a convincing Pythonic way to group a list of dicts by a key. The code below is readable, but not necessarily the most efficient: I have to sort first (a prerequisite for groupby) and then do the grouping (and I am also unsure how groupby in itertools is implemented).

One obvious alternative is collections.defaultdict, but then I have to do a lot of list.append calls (which seems less Pythonic?). Which option do you think is better? Or is there another, better way to group? Thanks.

from itertools import groupby
from operator import itemgetter

data = [ {'x':1, 'y':1},
         {'x':2, 'y':2},
         {'x':3, 'y':2},
         {'x':4, 'y':1}, ]

sortedData = sorted(data, key=itemgetter('y'))

for y, d in groupby( sortedData, itemgetter('y')):
    print y, list(d)

1 [{'y': 1, 'x': 1}, {'y': 1, 'x': 4}]
2 [{'y': 2, 'x': 2}, {'y': 2, 'x': 3}]
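
To illustrate why the sort is a prerequisite, here is a small sketch (the variable names unsorted_keys and sorted_keys are just for illustration). itertools.groupby only merges consecutive items that share a key, so on the unsorted list the two y == 1 records end up in separate groups:

```python
from itertools import groupby
from operator import itemgetter

data = [{'x': 1, 'y': 1},
        {'x': 2, 'y': 2},
        {'x': 3, 'y': 2},
        {'x': 4, 'y': 1}]

get_y = itemgetter('y')

# Without sorting, the y == 1 records are not adjacent,
# so groupby starts a new group each time the key changes.
unsorted_keys = [key for key, _ in groupby(data, key=get_y)]

# After sorting by the same key, equal keys are adjacent
# and each key yields exactly one group.
sorted_keys = [key for key, _ in groupby(sorted(data, key=get_y), key=get_y)]

print(unsorted_keys)  # [1, 2, 1]
print(sorted_keys)    # [1, 2]
```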

Upvotes: 1

Answers (2)

root

Reputation: 80386

As you already know, defaultdict is one alternative. I am not sure about the "Pythonicness", but it seems to be about twice as fast (since you asked about efficiency):

from collections import defaultdict

def f(l):
    # Map each 'y' value to the list of dicts that share it.
    d = defaultdict(list)
    for i in l:
        d[i.get('y')].append(i)
    return d

%timeit f(data)
100000 loops, best of 3: 3.7 us per loop

%timeit {y:list(d) for y, d in groupby(sorted(data, key=itemgetter('y')),
                                                        itemgetter('y'))}
100000 loops, best of 3: 8.33 us per loop
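
For anyone who wants to rerun the comparison outside IPython, here is a minimal sketch using the standard timeit module. The function names group_with_defaultdict and group_with_itertools and the iteration count are my own choices, not part of the measurements above, and absolute timings will vary by machine and Python version.

```python
import timeit
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

data = [{'x': 1, 'y': 1},
        {'x': 2, 'y': 2},
        {'x': 3, 'y': 2},
        {'x': 4, 'y': 1}]


def group_with_defaultdict(records):
    # Single pass: append each record to the list for its 'y' value.
    groups = defaultdict(list)
    for record in records:
        groups[record['y']].append(record)
    return groups


def group_with_itertools(records):
    # Sort by 'y' first so that groupby sees equal keys next to each other.
    get_y = itemgetter('y')
    return {key: list(group)
            for key, group in groupby(sorted(records, key=get_y), get_y)}


# Both approaches should produce the same grouping.
assert dict(group_with_defaultdict(data)) == group_with_itertools(data)

for func in (group_with_defaultdict, group_with_itertools):
    seconds = timeit.timeit(lambda: func(data), number=10000)
    print(func.__name__, seconds)
```

The defaultdict version makes a single pass over the list, while the groupby version has to pay for a sort first, so the gap may widen as the list grows.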

Upvotes: 1

Markus Jarderot

Reputation: 89221

To group an unordered list, you will need to examine each object in the list, and place it into a group:

def groupby(iterable, keyfunc=id):
    result = []
    groups = {}
    for item in iterable:
        key = keyfunc(item)
        group = groups.get(key)
        if group is None:
            groups[key] = group = []
            result.append((key,group))
        group.append(item)
    return result
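
As a usage sketch, here is the same logic applied to the data from the question. I have renamed the function to group_in_order (a hypothetical name) so it does not shadow itertools.groupby, and made the key function a required argument. No sorting is needed, and the groups come back in the order their keys first appear:

```python
from operator import itemgetter


def group_in_order(iterable, keyfunc):
    # Same approach as above: one pass, one list per distinct key,
    # with groups returned in first-seen key order.
    result = []
    groups = {}
    for item in iterable:
        key = keyfunc(item)
        group = groups.get(key)
        if group is None:
            groups[key] = group = []
            result.append((key, group))
        group.append(item)
    return result


data = [{'x': 1, 'y': 1},
        {'x': 2, 'y': 2},
        {'x': 3, 'y': 2},
        {'x': 4, 'y': 1}]

grouped = group_in_order(data, itemgetter('y'))
for y, items in grouped:
    print(y, [item['x'] for item in items])
# 1 [1, 4]
# 2 [2, 3]
```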

Upvotes: 1
