Reputation: 23
How to partition an integer array into N subsets such that the sum of these subsets is minimum?
For example the array is consisted of 11 elements and I need 6 subsets out of it.
{2,1,1,3,4,4,3,2,1,2,3}
The subsets : {2,1,1,3}, {4}, {4,3}, {3,2}, {1,2}, {3}
Minimum sum=7.
an alternative answer: {2,1,1} {3,4} {4} {3,2} {1,2} {3}
Minimum sum=7.
Note: The order in which the numbers appear in the original set, must be maintained while partitioning.
Upvotes: 0
Views: 1263
Reputation: 3038
One possible approach is to binary search for the answer.
We need a procedure that would check whether we can partition the set using only sums equal or below a parameter, S
. Let's call this procedure onlySumsBelow(S)
.
We can use a greedy solution to implement onlySumsBelow(S)
. Always add as many elements as possible in each subset, and stop just before reaching a sum larger than S (I am assuming here that we don't have negative elements, which may complicate the discussion). If we cannot reach the end of the sequence using the allowed number of subsets, then the sum is not valid (it is too small).
function onlySumsBelow(S) {
partitionsUsed = 1;
currentSum = 0;
for each value in sequence {
if (value > S) return false;
if (currentSum + value > S) {
// start a new partition
currentSum = value;
partitionsUsed++;
} else {
currentSum += value;
}
}
return partitionsUsed <= N;
}
Once we have the onlySumsBelow(S)
procedure, we can binary search for the answer, starting with an interval that at the left end has a value that ensures that the searched answer is not below (e.g. 0) and at the right end has a large enough number that ensures that the searched answer is not above (e.g. the sum of all numbers in the sequence).
If the efficiency is not a concern, instead of binary searching you can simply try multiple candidate answers one by one, starting from a small enough value, e.g. sum of all numbers divided by N, and then increasing by one until reaching a fine solution.
Remark: without the note in the end of the question (that restricts us to taking into account subsets of numbers that appear at neighboring positions in the input) the problem is NP-complete, since it is a generalization of the Partition problem, which only uses two sets.
Upvotes: 1