Mr.Robot
Mr.Robot

Reputation: 349

Return the all possible key combinations where the cumulative sum is equal to input value

For input data = {"a": 1, "b": 1, "c": 3}, I would like to obtain the dict that looks like

{1: [["a"], ["b"]], 
 2: [["a", "b"]], 
 3: [["c"]], 
 4: [["a", "c"], ["b", "c"]],
 5: [["a", "b", "c"]]}

where the key is all possible cumulative sum of the values of input dictionary and corresponding values are the list of all keys make the cumulative sum possible.

The difficulty here is that there could be multiple solutions to get the same cumulative sum (the above 1 is an example). Now I know I could do

{v: list(data.keys())[:idx+1] for v, idx in zip(np.cumsum(list(data.values())), range(len(data)))}

But this approach does not give complete solution I want. Concretely, for input data, I could only get

{1: ['a'], 
 2: ['a', 'b'], 
 5: ['a', 'b', 'c']}

And when there are more duplicates in data, the problem could become more difficult to resolve.

I am wondering if there is good solution to this problem (this problem seems to be NP-complete from my computation theory course).

Upvotes: 1

Views: 109

Answers (2)

Jasmijn
Jasmijn

Reputation: 10452

Your problem is related to the power set of data. To be specific, the power set excluding the empty set. We can use itertools from the standard library to help us out.

from collections import defaultdict
from itertools import chain, combinations

def nonempty_powerset(data):
    # adapted from the powerset recipe
    # at https://docs.python.org/3/library/itertools.html
    # start the range from 1 to exclude the empty set
    return chain.from_iterable(combinations(data, r) for r in range(1, len(data) + 1))

def get_subset_sums(data):
    output = defaultdict(list)
    for subset in nonempty_powerset(data):
        # subset is a tuple of strings, if you want lists we need to convert them
        output[sum(data[item] for item in subset)].append(list(subset))
    # convert to a regular dict
    return dict(output)

print(get_subset_sums({"a": 1, "b": 1, "c": 3}))

The runtime should be in O(N * 2 ** N) where N = len(data), which is pretty reasonable for something that involves power sets. Maybe there is a numpy method I missed that could help you with this but otherwise I think this is reasonably close to optimal.

Upvotes: 1

Ajax1234
Ajax1234

Reputation: 71461

You can use a recursive generator function to get all combinations of the dictionary keys, and the use collections.defaultdict to group the combinations on the sums they produce:

from collections import defaultdict
data = {"a": 1, "b": 1, "c": 3}
def to_sum(d, c = []):
   if c:
      yield c
   if len(d) > len(c):
      yield from [i for b in data for i in to_sum(d, c+[b]) if b not in c]

result = defaultdict(set)
for i in to_sum(data):
   result[sum(data[j] for j in i)].add(tuple(sorted(i)))

final_result = {a:list(map(list, b)) for a, b in result.items()}

Output:

{1: [['b'], ['a']], 2: [['a', 'b']], 5: [['a', 'b', 'c']], 4: [['b', 'c'], ['a', 'c']], 3: [['c']]}

Upvotes: 1

Related Questions