Reputation: 759
Input an integer array A
, where 1<=A[i]<=1000000
.
Find the minimum size(cardinality) of a positive integer multiset S
( just like coins ), such that for every A[i]
, there exist a subset of S
whose sum equals A[i]
.
Input A = [1,27,28]
.
Output 2
, one of the minimum multisets is {1,27}
.
Input A=[18,37,42,50]
Output 3
, the only minimum multisets is {5,13,37}
This test case shows that the min(A)
don't appear in S
, and the answer > ceil(log2(len(A)))
.
Even if len(A)<=10
, I can't find any correct and determined algorithm. Is there any solution that works when len(A)
is about 10?
I can only find an algorithm that works whenlen(A)<=6
:
For example, when len(A)==6
, remove the duplicates from A
. Suppose the size of S
is 5
, then enumerate all possible coefficients c00,c01,c02,c03,c04,c10,...
(cij∈{0,1}) in time complexity O(perm(32, 5)), such that ∑(cij*S[j])==A[i]
(0<=i<5). Solve the five linear equations using Gauss elimination in O(5^3), then additionally check A[5]
using a knapsack algorithm. If there isn't a solution, the answer is 6, otherwise the size of S
is at most 5. Repeating the previous process again, by assuming the size of S
is 4. Until finding the correct solution.
Do such problems ( hard to write use brute force ) have some formal terminology?
Upvotes: 14
Views: 648
Reputation: 1834
In the following answer I will consider arrays A
where all the values are strictly positive and that the values in the array are unique. First I would like to start by stating the relatively obvious.
A
has 2 numbers, the smallest set of numbers is two (the set A
itself)A
has 3 numbers, the smallest set of numbers will be 2 iff the sum of the smallest numbers equal to the third, otherwise it will be 3 (the set A
itself)After noting this I would like to look at the problem from the opposite direction, which will give us insights regarding the solution to our problem. Given a set of n
numbers, a total of 2**n - 1
different numbers can be put together since each number in the set can be individually present or not in the summation.
Given a set of three numbers {a,b,c}
a total of 2**3 - 1 = 7
numbers can be generated:
[a, b, c, a+b, a+c, b+c, a+b+c]
Therefore, we can derive a lower bound on the smallest set number given the length of A
, noted as m
:
Lower_bound = ceil(log2(m+1))
This means that the lower bound for A
of length 4 to 7 is 3. The bound for lengths 8 to 15 is 4 etc.
This will present a suggestion to a solution, it might has some flaws in it but it covered most of what I could come up with.
Since we now have a bound, if we find a set with length of the bound which solves our problem it is guaranteed to be an optimal and the result should be the bound. I would suggest the following solution
for ss in range(3, m)
1.1. get all combinations between 2
and ss-1
, check if the sum of each combination is in the array. If so, remove the sum from the array
1.2. reduce from each remaining value in A
all the rest of the remaining values, creating a symmetric rectangular matrix where the diagonal is 0
1.3. for each possible combination of two positive values in the formed matrix check if:
1.3.1. the sum of the combination is in A
1.3.2. A number in A can be decomposed into two numbers with which it is possible to add with other members of A
. If so, remove the three numbers from A
and add the two decomposed numbers instead
A python implementation of this is as follows:
from itertools import combinations
import numpy as np
def get_subset_count(A):
m = len(A)
if m == 2:
return 2
elif m == 3:
return 3 if A[0] + A[1] == A[2] else 2
else:
lower_bound = int(np.ceil(np.log2(m + 1)))
for ss in range(lower_bound, m): # checking if there is a subset with size of ss
A_temp = A.copy()
removed = set()
sub_values = []
# phase 1 - looking if a summation of components in A result in other components in A
for kk in range(2, ss+1):
for comb in combinations(A[:ss+1, kk):
if sum(comb) in A_temp:
A_temp.remove(sum(comb))
removed.add(sum(comb))
if len(A_temp) <= ss: return ss
# phase 2 - looking for hidden base components
deduction = np.array(A_temp)[:, None] - np.array(A_temp)
for comb in combinations(deduction[deduction > 0].tolist(), 2):
if sum(comb) in A_temp: # checking if the sum is in the array
A_temp2 = A_temp.copy()
A_temp2.remove(sum(comb))
comb0_logic = [comb[0] + a in A_temp2 for a in A_temp2]
comb1_logic = [comb[1] + a in A_temp2 for a in A_temp2]
if any(comb0_logic) and any(comb1_logic): # checking if combination is a hidden base
A_temp.remove(sum(comb))
removed.add(sum(comb))
A_temp.remove(sum(comb[0] + np.array(A_temp2)[comb0_logic]))
removed.add(sum(comb[0] + np.array(A_temp2)[comb0_logic]))
sub_values = [comb[0]] + sub_values
if comb[0] + np.array(A_temp2)[comb0_logic] != comb[1] + np.array(A_temp2)[comb1_logic]:
A_temp.remove(sum(comb[1] + np.array(A_temp2)[comb1_logic]))
removed.add(sum(comb[1] + np.array(A_temp2)[comb1_logic]))
sub_values = [comb[1]] + sub_values
if len(A_temp) + len(sub_values) <= ss: return ss
return len(A)
When running this method with the following:
A = [4,5,7,16]
print(get_subset_count(A)) # 3 - 4,5,7
A = [9,11,12,16]
print(get_subset_count(A)) # 3 - 4,5,7
A = [4,5,7,9,11,12,16]
print(get_subset_count(A)) # 3 - 4,5,7
A = [4,5,7,9,11,12,17]
print(get_subset_count(A)) # 4 - 4,5,7,17
A = [18,20,37,42,50]
print(get_subset_count(A)) # 4 - 5,13,20,37
A = [18,20,37,20,42,50,55]
print(get_subset_count(A)) # 4 - 5,13,20,37
A = [18,37,20,42,50,54]
print(get_subset_count(A)) # 5 - 18, 37, 20, 24, 30
A = [1,2,3,4,5,6,7,8,9,10]
print(get_subset_count(A)) # 4 - 1,2,3,4 or 1,2,4,8
Regarding runtime, this is NP in my opinion due to the combinations
nature.
This method still fails to discover some of the samples, I'll explain with an example:
for A = [4,13,21,37,45,62]
exists a lower bound set which is [4,8,13,37]
. Denoting the set members as [a,b,c,d]
respectively, we can describe A
as
A = [a, c, c+b, d, d+b, a+b+c+d]
My suggested method will discover that 62 = 4+21+37
and will remove it, then it will not discover that 45 - 37 = 21 - 13 = 8
After looking for a solution to these cases as well, I came up with the following:
def get_subset_count(A):
m = len(A)
if m == 2:
return 2
elif m == 3:
return 3 if A[0] + A[1] == A[2] else 2
else:
lower_bound = int(np.ceil(np.log2(m + 1)))
for ss in range(lower_bound, m): # checking if there is a subset with size of ss
A_temp = A.copy()
removed = set()
sub_values = []
# phase 1 - looking if a summation of components in A result in other components in A
for kk in range(2, ss+1):
for comb in combinations(A[:ss+1], kk):
if sum(comb) in A_temp:
A_temp.remove(sum(comb))
removed.add(sum(comb))
if len(A_temp) <= ss: return ss
# phase 2 - looking for hidden base components
deduction = np.array(A_temp)[:, None] - np.array(A_temp)
for comb in combinations(deduction[deduction > 0].tolist(), 2):
if sum(comb) in A_temp: # checking if the sum is in the array
A_temp2 = A_temp.copy()
A_temp2.remove(sum(comb))
comb0_logic = [comb[0] + a in A_temp2 for a in A_temp2]
comb0_logic_sub = [comb[0] + a in A_temp2 for a in sub_values]
comb1_logic = [comb[1] + a in A_temp2 for a in A_temp2]
comb1_logic_sub = [comb[1] + a in A_temp2 for a in sub_values]
if any(comb0_logic) and any(comb1_logic): # checking if combination is a hidden base
A_temp.remove(sum(comb))
removed.add(sum(comb))
A_temp.remove(sum(comb[0] + np.array(A_temp2)[comb0_logic]))
removed.add(sum(comb[0] + np.array(A_temp2)[comb0_logic]))
if comb[0] not in sub_values: sub_values.append(comb[0]) # if this sub value is not in the list, adding it
if comb[0] + np.array(A_temp2)[comb0_logic] != comb[1] + np.array(A_temp2)[comb1_logic]:
A_temp.remove(sum(comb[1] + np.array(A_temp2)[comb1_logic]))
removed.add(sum(comb[1] + np.array(A_temp2)[comb1_logic]))
if comb[1] not in sub_values: sub_values.append(comb[1]) # if this sub value is not in the list, adding it
elif (comb[0] in sub_values) or (comb[1] in sub_values):
if any(comb0_logic_sub):
A_temp.remove(sum(comb[0] + np.array(sub_values)[comb0_logic_sub]))
removed.add(sum(comb[0] + np.array(sub_values)[comb0_logic_sub]))
if comb[0] not in sub_values: sub_values.append(comb[0]) # if this sub value is not in the list, adding it
if any(comb1_logic_sub):
A_temp.remove(sum(comb[1] + np.array(sub_values)[comb1_logic_sub]))
removed.add(sum(comb[1] + np.array(sub_values)[comb1_logic_sub]))
if comb[1] not in sub_values: sub_values.append(comb[1]) # if this sub value is not in the list, adding it
elif comb[0] == comb[1]: # cases where the hidden base is not in the set but the sum of the hidden base with other parts are in the set
A_temp2 = A_temp.copy()
comb0_logic = [comb[0] + a in A_temp2 for a in A_temp2]
if any(comb0_logic):
removed_numbers = comb[0] + np.array(A_temp2)[comb0_logic]
for removed_number in removed_numbers:
A_temp.remove(removed_number)
removed.add(removed_number)
if comb[0] not in sub_values: sub_values.append(comb[0])
elif comb[0] in sub_values and comb[1] in sub_values and sum(comb) in A_temp: # cases where we found the base already and we need to remove a single number
A_temp.remove(sum(comb))
removed.add(sum(comb))
# phase 3 - check if we did not forget any other number due to the order of combinations
for kk in range(2, ss+1):
for comb in combinations(sub_values, kk):
if sum(comb) in A_temp:
A_temp.remove(sum(comb))
if len(A_temp) + len(sub_values) <= ss: return ss
return len(A)
Upvotes: 2