Reputation: 5804
I'd like a function that can group a list of dictionaries into sublists of dictionaries depending on an arbitrary set of keys that all dictionaries have in common.
For example, I'd like the following list to be grouped into sublists of dictionaries depending on a certain set of keys
l = [{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100},{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100},{'name':'g','type':'normal','color':'red','amount':100}]
If I wanted to group by type, the following list would result, which has a sublists where each sublist has the same type:
[[{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
If I wanted to group by type and color, the following would result where the list contains sublists that have the same type and color:
[[{'name':'b','type':'new','color':'blue','amount':100}],[{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100}],[{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
I understand the following function can group by one key, but I'd like to group by multiple keys:
def group_by_key(l,i):
l = [list(grp) for key, grp in itertools.groupby(sorted(l, key=operator.itemgetter(i)), key=operator.itemgetter(i))]
This is my attempt using the group_by_function above
def group_by_multiple_keys(l,*keys):
for key in keys:
l = group_by_key(l,key)
l = [item for sublist in l for item in sublist]
return l
The issue there is that it ungroups it right after it grouped it by a key. Instead, I'd like to re-group it by another key and still have one list of sublists.
Upvotes: 2
Views: 2297
Reputation: 12002
itertools.groupby()
+ operator.itemgetter()
will do what you want. groupby()
takes an iterable and a key function, and groups the items in the iterable by the value returned by passing each item to the key function. itemgetter()
is a factory that returns a function, which gets the specified items from any item passed to it.
from __future__ import print_function
import pprint
from itertools import groupby
from operator import itemgetter
def group_by_keys(iterable, keys):
key_func = itemgetter(*keys)
# For groupby() to do what we want, the iterable needs to be sorted
# by the same key function that we're grouping by.
sorted_iterable = sorted(iterable, key=key_func)
return [list(group) for key, group in groupby(sorted_iterable, key_func)]
dicts = [
{'name': 'b', 'type': 'new', 'color': 'blue', 'amount': 100},
{'name': 'c', 'type': 'new', 'color': 'red', 'amount': 100},
{'name': 'd', 'type': 'old', 'color': 'gold', 'amount': 100},
{'name': 'e', 'type': 'old', 'color': 'red', 'amount': 100},
{'name': 'f', 'type': 'old', 'color': 'red', 'amount': 100},
{'name': 'g', 'type': 'normal', 'color': 'red', 'amount': 100}
]
Examples:
>>> pprint.pprint(group_by_keys(dicts, ('type',)))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'},
{'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
[{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'},
{'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
{'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}],
[{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}]]
>>>
>>> pprint.pprint(group_by_keys(dicts, ('type', 'color')))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'}],
[{'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
[{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'}],
[{'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
{'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}],
[{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}]]
Upvotes: 3