Reputation: 3799

Nicer way to merge a list of dictionaries by key

I have a list of dictionaries and a function that extracts a value from each dictionary in that list. I want to build a dictionary whose keys are the values the function returns for the dictionaries in the list. The value stored under each key should be the list of those dictionaries from the original list for which the function returned that key.

I know this explanation is confusing, so here is an implementation:

keygen = lambda x: x['key']

data = [{'key': 'key1',
         'data': 'value2'},
        {'key': 'key3',
         'data': 'value2'},
        {'key': 'key2',
         'data': 'value2'},
        {'key': 'key2',
         'data': 'value2'},
        {'key': 'key1',
         'data': 'value2'}]

def merge_by_keygen(data, keygen):
    return_value = {} 
    for dataset in data:
        if keygen(dataset) not in return_value.keys():
            return_value[keygen(dataset)] = [] 
        return_value[keygen(dataset)].append(dataset)
    return return_value

merge_by_keygen(data, keygen)

returns:

{'key3': [{'data': 'value2', 'key': 'key3'}], 
 'key2': [{'data': 'value2', 'key': 'key2'}, {'data': 'value2', 'key': 'key2'}], 
 'key1': [{'data': 'value2', 'key': 'key1'}, {'data': 'value2', 'key': 'key1'}]}

I'm looking for a nicer and more compact implementation of the same logic, for example using dictionary or list comprehensions. Thanks!

Upvotes: 2

Answers (4)

TessellatingHeckler

Reputation: 28983

I think this does it:

return_value = {}
for d in data:
    return_value.setdefault(keygen(d), []).append(d)

You can write it as a list comprehension, but it's ugly: it relies on the comprehension's side effects to modify the data, then builds up a list of None results only to throw it away:

r = {}
[r.setdefault(keygen(d), []).append(d) for d in data]

The core of your function collapses into the dictionary's setdefault method. Calling the keygen, checking whether the key is in the return dictionary, creating and storing an empty list if it isn't, and then querying the dictionary again to get the list to append to: all of that is done by setdefault().
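As a minimal sketch of how the loop above could be wrapped back into the original function, assuming the same merge_by_keygen signature as in the question (the shortened sample data and the name grouped are only illustrative):

```python
def merge_by_keygen(data, keygen):
    """Group the items of `data` by the value `keygen` returns for each."""
    grouped = {}
    for item in data:
        # setdefault returns the list already stored under this key, or
        # stores a new empty list under the key and returns that instead.
        grouped.setdefault(keygen(item), []).append(item)
    return grouped


# Illustrative data, shorter than the list in the question.
data = [{'key': 'key1', 'data': 'a'},
        {'key': 'key2', 'data': 'b'},
        {'key': 'key1', 'data': 'c'}]

result = merge_by_keygen(data, lambda d: d['key'])
```

Here result['key1'] holds the first and third dictionaries, in their original order, and result['key2'] holds the second.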

Upvotes: 0

eriknw

Reputation: 286

If you don't mind using a third-party package, this is easily done with toolz.groupby:

>>> import toolz
>>> toolz.groupby(keygen, data)
{'key1': [{'data': 'value2', 'key': 'key1'},
          {'data': 'value2', 'key': 'key1'}],
 'key2': [{'data': 'value2', 'key': 'key2'},
          {'data': 'value2', 'key': 'key2'}],
 'key3': [{'data': 'value2', 'key': 'key3'}]}

The same result is also obtained with toolz.groupby('key', data).

Upvotes: 2

Abhijit

Reputation: 63727

This is an ideal problem for itertools.groupby.

Implementation

from itertools import groupby
from operator import itemgetter
groups = groupby(sorted(data, key=itemgetter('key')), key=itemgetter('key'))
data_dict = {k: list(g) for k, g in groups}

Or, if you prefer a one-liner:

data_dict = {k: list(g)
             for k, g in groupby(sorted(data, key=itemgetter('key')),
                                 key=itemgetter('key'))}

Output

{'key1': [{'data': 'value2', 'key': 'key1'},
          {'data': 'value2', 'key': 'key1'}],
 'key2': [{'data': 'value2', 'key': 'key2'},
          {'data': 'value2', 'key': 'key2'}],
 'key3': [{'data': 'value2', 'key': 'key3'}]}
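One detail worth spelling out: itertools.groupby only groups consecutive items with equal keys, which is why the data is sorted first. The sketch below is a variant that accepts an arbitrary keygen function, as in the question, instead of the hard-coded itemgetter('key'). The shortened sample data and the names ordered, grouped and unsorted_keys are illustrative assumptions, and the keys must be sortable for this approach to work.

```python
from itertools import groupby


def merge_by_keygen(data, keygen):
    """Group `data` by `keygen`, sorting first so equal keys are adjacent."""
    ordered = sorted(data, key=keygen)
    return {k: list(g) for k, g in groupby(ordered, key=keygen)}


# Illustrative data, shorter than the list in the question.
data = [{'key': 'key1', 'data': 'a'},
        {'key': 'key2', 'data': 'b'},
        {'key': 'key1', 'data': 'c'}]

grouped = merge_by_keygen(data, lambda d: d['key'])

# Without sorting, 'key1' shows up as two separate groups, because its
# two items are not next to each other in the input.
unsorted_keys = [k for k, _ in groupby(data, key=lambda d: d['key'])]
```

Because sorted is stable, the items inside each group keep their original relative order. The sort does make this approach O(n log n), whereas a plain loop over the data is O(n).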

Upvotes: 5

Peter DeGlopper

Reputation: 37319

I don't think this is amenable to a comprehension, but you can make it tidier using a collections.defaultdict(list) instance:

import collections

def merge_by_keygen(data, keygen):
    return_value = collections.defaultdict(list)
    for dataset in data:
        key = keygen(dataset)
        return_value[key].append(dataset)
    return return_value

That looks pretty clean to me. You could experiment with where you call the keygen function if you like, but I think you'd probably lose clarity.
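One thing to keep in mind is that the function as written returns a defaultdict, so looking up a missing key on the result silently creates an empty list for it. If that matters to the caller, a possible variant (a sketch, not a required change) converts the result to a plain dict before returning. The shortened sample data and the name result are illustrative:

```python
import collections


def merge_by_keygen(data, keygen):
    return_value = collections.defaultdict(list)
    for dataset in data:
        # Missing keys get a fresh empty list automatically.
        return_value[keygen(dataset)].append(dataset)
    # Convert so that later lookups of unknown keys raise KeyError
    # instead of quietly adding empty lists.
    return dict(return_value)


# Illustrative data, shorter than the list in the question.
data = [{'key': 'key1', 'data': 'a'},
        {'key': 'key2', 'data': 'b'},
        {'key': 'key1', 'data': 'c'}]

result = merge_by_keygen(data, lambda d: d['key'])
```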

Upvotes: 1
