theanine

Reputation: 1008

Sum tuples of cartesian product of arbitrary number of dicts

I'd like to take the Cartesian product of multiple dicts, key by key, sum each resulting tuple, and return the result as a dict. Keys that are missing from a dict should be ignored for that dict (this is ideal but not required; you may assume all keys exist in all dicts if needed). Below is essentially what I'm trying to achieve, shown with two dicts. Is there a simpler way to do this that also works with N dicts?

import itertools


def doProdSum(inp1, inp2):
    prod = {}
    for key in set(inp1) | set(inp2):
        # Key present in only one dict: keep that dict's values unchanged
        if key not in inp1 or key not in inp2:
            prod[key] = inp1[key] if key in inp1 else inp2[key]
            continue
        # Key present in both: sum every pair from the Cartesian product
        prod[key] = [a + b for a, b in itertools.product(inp1[key], inp2[key])]
    return prod

x = doProdSum({"a":[0,1,2],"b":[10],"c":[1,2,3,4]}, {"a":[1,1,1],"b":[1,2,3,4,5]})
print(x)

Output (as expected):

{'c': [1, 2, 3, 4], 'b': [11, 12, 13, 14, 15], 'a': [1, 1, 1, 2, 2, 2, 3, 3, 3]}

Upvotes: 1

Views: 64

Answers (1)

Thierry Lathuille

Reputation: 24233

You can do it like this, by first reorganizing your data by key:

from collections import defaultdict
from itertools import product


def doProdSum(list_of_dicts):
    # We reorganize the data by key
    lists_by_key = defaultdict(list)
    for d in list_of_dicts:
        for k, v in d.items():
            lists_by_key[k].append(v)

    # lists_by_key looks like {'a': [[0, 1, 2], [1, 1, 1]], 'b': [[10], [1, 2, 3, 4, 5]], 'c': [[1, 2, 3, 4]]}

    # Then we generate the output
    out = {}
    for key, lists in lists_by_key.items():
        out[key] = [sum(prod) for prod in product(*lists)]

    return out

Example output:

list_of_dicts = [{"a":[0,1,2],"b":[10],"c":[1,2,3,4]}, {"a":[1,1,1],"b":[1,2,3,4,5]}]
doProdSum(list_of_dicts)

# {'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
#  'b': [11, 12, 13, 14, 15],
#  'c': [1, 2, 3, 4]}
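
Since the question asks about N dicts, here is a rough sketch of the same idea called with three dicts. The name `prod_sum` and the variadic `*dicts` signature are my own assumptions for illustration, not part of the code above; the logic is otherwise the same grouping by key followed by `itertools.product`.

```python
from collections import defaultdict
from itertools import product


def prod_sum(*dicts):
    """Hypothetical variadic variant: per key, sum each tuple of the Cartesian product."""
    # Group the value lists by key; a dict that lacks a key contributes nothing for it
    lists_by_key = defaultdict(list)
    for d in dicts:
        for key, values in d.items():
            lists_by_key[key].append(values)

    # product(*lists) yields one tuple per combination; sum() collapses each tuple
    return {key: [sum(t) for t in product(*lists)] for key, lists in lists_by_key.items()}


result = prod_sum({"a": [0, 1], "b": [10]}, {"a": [1, 2]}, {"a": [100], "b": [1, 2]})
print(result)
# {'a': [101, 102, 102, 103], 'b': [11, 12]}
```

A key that appears in only one dict comes back with its values unchanged, because the product of a single list yields 1-tuples whose sum is the element itself.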

Upvotes: 1
