Merge several dictionaries creating array on different values

Question

I have a list of several dictionaries that all have the same keys. Some of the dictionaries are identical except for one value. How can I merge them into a single dictionary that holds the differing values as a list?

Let me give you an example:

Let's say I have these dictionaries:

[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]

My desired output would be this:

[{'a':1, 'b':2,'c':[3,4]},{'a':1, 'b':3,'c':[3,4]}]

I've tried nested for loops and if statements, but that is expensive and messy, and I'm sure there must be a better way. Could you give me a hand?

How could I do that for any kind of dictionary, assuming that all the dictionaries have the same number of keys and that I know the name of the key to be merged into a list (c in this case)?

Thanks!

RoadRunner · Accepted Answer

Use a collections.defaultdict to group the c values by a and b tuple keys:

from collections import defaultdict

lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]

d = defaultdict(list)
for x in lst:
    d[x["a"], x["b"]].append(x["c"])

result = [{"a": a, "b": b, "c": c} for (a, b), c in d.items()]

print(result)

You could also use itertools.groupby if lst is already ordered by a and b (groupby only groups consecutive items that share the same key):

from itertools import groupby
from operator import itemgetter

lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]

result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(lst, key=itemgetter("a", "b"))
]

print(result)

Or, if lst is not ordered by a and b, we can sort it by those two keys first:

result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(
        sorted(lst, key=itemgetter("a", "b")), key=itemgetter("a", "b")
    )
]

print(result)

Output:

[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
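To see why the sort matters for the groupby versions, here is a small sketch with a made-up unsorted list (unsorted_lst and merge_with_groupby are names I'm introducing just for illustration). Without sorting, rows with the same a and b that are not adjacent end up in separate groups:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: the two rows with "b": 2 are not next to each other.
unsorted_lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 2, "c": 4},
]

key = itemgetter("a", "b")


def merge_with_groupby(rows):
    # groupby starts a new group every time the key changes,
    # so it only merges rows that are adjacent in the input.
    return [
        {"a": a, "b": b, "c": [x["c"] for x in g]}
        for (a, b), g in groupby(rows, key=key)
    ]


without_sort = merge_with_groupby(unsorted_lst)
with_sort = merge_with_groupby(sorted(unsorted_lst, key=key))

print(without_sort)
# [{'a': 1, 'b': 2, 'c': [3]}, {'a': 1, 'b': 3, 'c': [3]}, {'a': 1, 'b': 2, 'c': [4]}]
print(with_sort)
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3]}]
```

The defaultdict version does not have this constraint, since it looks groups up by key rather than relying on adjacency.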

Update

For a more generic solution that works with any number of keys:

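def merge_lst_dicts(lst, keys, merge_key):
    # Map each tuple of grouping values to the list of merge_key values seen for it
    groups = defaultdict(list)

    for item in lst:
        # item.get() returns None for a missing key instead of raising KeyError
        key = tuple(item.get(k) for k in keys)
        groups[key].append(item.get(merge_key))

    # Rebuild one dictionary per group, with the merged values as a list
    return [
        {**dict(zip(keys, group_key)), merge_key: merged_values}
        for group_key, merged_values in groups.items()
    ]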

print(merge_lst_dicts(lst, ["a", "b"], "c"))
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
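If you only know the name of the key to merge, as in the question, one possible variation is to treat every other key as part of the group. This is just a sketch: merge_on_key is a hypothetical name, and it assumes that all values other than the merged one are hashable and that the dictionaries list their keys in the same order.

```python
from collections import defaultdict


def merge_on_key(rows, merge_key):
    """Group rows by every key except merge_key (illustrative helper)."""
    groups = defaultdict(list)

    for row in rows:
        # Assumption: all rows share the same key order and hashable values,
        # so equal rows produce equal (key, value) tuples.
        group_key = tuple((k, v) for k, v in row.items() if k != merge_key)
        groups[group_key].append(row[merge_key])

    # Turn each (key, value) tuple back into a dict and attach the merged list
    return [
        {**dict(group_key), merge_key: values}
        for group_key, values in groups.items()
    ]


lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]

merged = merge_on_key(lst, "c")
print(merged)
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
```

Unlike merge_lst_dicts, this uses row[merge_key], so a row that lacks the merge key raises a KeyError instead of contributing None.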
