Reputation: 349
For input data = {"a": 1, "b": 1, "c": 3}
, I would like to obtain the dict that looks like
{1: [["a"], ["b"]],
2: [["a", "b"]],
3: [["c"]],
4: [["a", "c"], ["b", "c"]],
5: [["a", "b", "c"]]}
where the key is all possible cumulative sum of the values of input dictionary and corresponding values are the list of all keys make the cumulative sum possible.
The difficulty here is that there could be multiple solutions to get the same cumulative sum (the above 1
is an example). Now I know I could do
{v: list(data.keys())[:idx+1] for v, idx in zip(np.cumsum(list(data.values())), range(len(data)))}
But this approach does not give complete solution I want. Concretely, for input data
, I could only get
{1: ['a'],
2: ['a', 'b'],
5: ['a', 'b', 'c']}
And when there are more duplicates in data
, the problem could become more difficult to resolve.
I am wondering if there is good solution to this problem (this problem seems to be NP-complete from my computation theory course).
Upvotes: 1
Views: 109
Reputation: 10452
Your problem is related to the power set of data
. To be specific, the power set excluding the empty set. We can use itertools
from the standard library to help us out.
from collections import defaultdict
from itertools import chain, combinations
def nonempty_powerset(data):
# adapted from the powerset recipe
# at https://docs.python.org/3/library/itertools.html
# start the range from 1 to exclude the empty set
return chain.from_iterable(combinations(data, r) for r in range(1, len(data) + 1))
def get_subset_sums(data):
output = defaultdict(list)
for subset in nonempty_powerset(data):
# subset is a tuple of strings, if you want lists we need to convert them
output[sum(data[item] for item in subset)].append(list(subset))
# convert to a regular dict
return dict(output)
print(get_subset_sums({"a": 1, "b": 1, "c": 3}))
The runtime should be in O(N * 2 ** N)
where N = len(data)
, which is pretty reasonable for something that involves power sets. Maybe there is a numpy method I missed that could help you with this but otherwise I think this is reasonably close to optimal.
Upvotes: 1
Reputation: 71461
You can use a recursive generator function to get all combinations of the dictionary keys, and the use collections.defaultdict
to group the combinations on the sums they produce:
from collections import defaultdict
data = {"a": 1, "b": 1, "c": 3}
def to_sum(d, c = []):
if c:
yield c
if len(d) > len(c):
yield from [i for b in data for i in to_sum(d, c+[b]) if b not in c]
result = defaultdict(set)
for i in to_sum(data):
result[sum(data[j] for j in i)].add(tuple(sorted(i)))
final_result = {a:list(map(list, b)) for a, b in result.items()}
Output:
{1: [['b'], ['a']], 2: [['a', 'b']], 5: [['a', 'b', 'c']], 4: [['b', 'c'], ['a', 'c']], 3: [['c']]}
Upvotes: 1