Reputation: 2147
I am struggling to find a convincing Pythonic way to do a group-by on a list of dicts. The approach below is readable but not necessarily the most efficient: I have to sort first (a prerequisite for groupby) and then do the grouping (which also raises a question about how groupby in itertools is implemented).
One obvious alternative is collections.defaultdict, but then I have to do a lot of list.append calls (less Pythonic?). Which one do you think is the better option? Or is there a better way to do a group-by? Thanks.
from itertools import groupby
from operator import itemgetter
data = [
    {'x': 1, 'y': 1},
    {'x': 2, 'y': 2},
    {'x': 3, 'y': 2},
    {'x': 4, 'y': 1},
]
sortedData = sorted(data, key=itemgetter('y'))
for y, d in groupby(sortedData, itemgetter('y')):
    print y, list(d)
1 [{'y': 1, 'x': 1}, {'y': 1, 'x': 4}]
2 [{'y': 2, 'x': 2}, {'y': 2, 'x': 3}]
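On why the sort is a prerequisite: itertools.groupby starts a new group every time the key changes between consecutive items, so on unsorted input the same key can show up in more than one group. A minimal sketch using the same data (the names unsorted_keys and sorted_keys are only for illustration):

```python
from itertools import groupby
from operator import itemgetter

data = [
    {'x': 1, 'y': 1},
    {'x': 2, 'y': 2},
    {'x': 3, 'y': 2},
    {'x': 4, 'y': 1},
]
by_y = itemgetter('y')

# Without sorting, y=1 appears as two separate runs, so it yields two groups.
unsorted_keys = [key for key, _ in groupby(data, key=by_y)]

# After sorting by the same key, each key forms exactly one run.
sorted_keys = [key for key, _ in groupby(sorted(data, key=by_y), key=by_y)]

print(unsorted_keys)  # [1, 2, 1]
print(sorted_keys)    # [1, 2]
```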
Upvotes: 1
Views: 205
Reputation: 80386
As you already know, defaultdict is one alternative. I am not sure about the "Pythonicness", but it seems to be about twice as fast (since you asked about efficiency):
from collections import defaultdict
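def f(l):
    # Group the dicts in l by their 'y' value in a single pass.
    d = defaultdict(list)
    for i in l:  # iterate over the argument, not the global data
        d[i.get('y')].append(i)
    return d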
%timeit f(data)
100000 loops, best of 3: 3.7 us per loop
%timeit {y:list(d) for y, d in groupby(sorted(data, key=itemgetter('y')),
itemgetter('y'))}
100000 loops, best of 3: 8.33 us per loop
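Before comparing timings it is worth confirming that the two approaches actually return the same groups. A small sketch, assuming the data from the question; group_with_defaultdict and group_with_groupby are hypothetical helper names, not anything from the standard library:

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

data = [
    {'x': 1, 'y': 1},
    {'x': 2, 'y': 2},
    {'x': 3, 'y': 2},
    {'x': 4, 'y': 1},
]

def group_with_defaultdict(rows, key):
    # One pass, no sorting: append each row to the list for its key.
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return groups

def group_with_groupby(rows, key):
    # Sort first so equal keys are adjacent, then collect each run.
    return {k: list(g) for k, g in groupby(sorted(rows, key=key), key)}

by_y = itemgetter('y')
same = dict(group_with_defaultdict(data, by_y)) == group_with_groupby(data, by_y)
print(same)  # True
```

Because sorted is stable, the items inside each group come out in the same relative order in both versions. The defaultdict version skips the O(n log n) sort, which is a plausible reason for the difference in the timings.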
Upvotes: 1
Reputation: 89221
To group an unordered list, you will need to examine each object in the list, and place it into a group:
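def groupby(iterable, keyfunc=id):
    # Note: the default keyfunc=id groups by object identity, so pass a
    # real key function (e.g. itemgetter('y')) for meaningful groups.
    result = []   # (key, group) pairs in first-seen key order
    groups = {}   # key -> group list, for fast lookup
    for item in iterable:
        key = keyfunc(item)
        group = groups.get(key)
        if group is None:
            # First time this key is seen: create its group and record it.
            groups[key] = group = []
            result.append((key, group))
        group.append(item)
    return result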
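A possible usage sketch for this single-pass approach, applied to the data from the question. The function is repeated here under the hypothetical name group_unsorted so that it does not shadow itertools.groupby, and itemgetter('y') is passed explicitly as the key:

```python
from operator import itemgetter

def group_unsorted(iterable, keyfunc):
    # Single pass over unsorted input; keys must be hashable.
    result = []   # (key, group) pairs in first-seen key order
    groups = {}   # key -> group list
    for item in iterable:
        key = keyfunc(item)
        group = groups.get(key)
        if group is None:
            groups[key] = group = []
            result.append((key, group))
        group.append(item)
    return result

data = [
    {'x': 1, 'y': 1},
    {'x': 2, 'y': 2},
    {'x': 3, 'y': 2},
    {'x': 4, 'y': 1},
]

grouped = group_unsorted(data, itemgetter('y'))
for y, items in grouped:
    print(y, [item['x'] for item in items])
# 1 [1, 4]
# 2 [2, 3]
```

No sort is needed, and the groups come back in the order in which each key was first seen.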
Upvotes: 1