Roel Vlemmings
Roel Vlemmings

Reputation: 379

Algorithm: Optimal combination of values to stay within range

I have the following math problem, which I need in an application, and I am wondering if there is an efficient way to find the best solution as opposed to an approximation.

  1. I have a list of positive and negative values.
  2. The sum of these values is within a range (x, y).
  3. I want to know the maximum number of values that I can eliminate such that the SUM of the remaining values stays within the range.

Example:

Values: -10, -5, -2, 7, 9, 15
Sum: 14
Range: (10, 18)

Eliminate -2 => SUM = 16
Eliminate -5 => SUM = 21
Eliminate 7 => SUM = 14
Eliminate -10 => SUM = 24
Eliminate 9 => SUM = 15

Eliminating 15 would make SUM = 0, which is outside the range. 5 values eliminated.

Whereas if I start by eliminating 15, and then -10, -5, -2, I can only eliminate 4 values.

I once wrote an algorithm that simply tried all possible combinations, but the performance of that degrades rapidly, once you have 25 or more values. I need results in a tenth of a second for 100-200 values.

Currently, I sort the values from small to large on an absolute basis and then eliminate them one by one until the sum is no longer in the range. Obviously, that may not always give the optimal solution.

If this is not the right place for this type of question, and you can refer another forum, that would be of help as well.

Upvotes: 4

Views: 671

Answers (3)

vroomfondel
vroomfondel

Reputation: 3106

I'm tempted to do this backwards, which I'm not sure is allowed (see my comment.)

So instead of eliminating values one by one, let's find the smallest sub-list whose sum is within the range!

There is an issue -- the subset sum problem is np-complete, so this approach is too. (Imagine a situation in which your range is 0, and it is the same problem.)

There's a known algorithm that solves this problem in O(2N/2). I'll mock up some Python code, but in the mean time, the Wikipedia page should be helpful. Since you want to find the least list in a range, it's obviously going to need a bit of modifying.

Essentially, you split your list into two arbitrary sublists of length N/2 each (where your list has N elements.) Then generate all subsets in each list, and calculate their sums. (Here, I would store the subset and its sum in a dictionary, so you know which numbers you have left. Since you only want to find the smallest, I would also eliminate all subsets that have the same sum as a smaller one.) Sort these lists, and run through one forwards and one backwards until you find all of the sums that fit within the range. Finally, find out which contains the fewest elements, and you're good to go!

If you are allowed to make eliminations that violate the rule as long as the final list is within the range, check out this question

Edit: here's some Python. It is:

  • untested

  • Python, so not particularly fast

  • obviously not optimal

  • in dire need of refactoring

but I think as a general idea, the fastest algorithm that you're going to be able to get. I'd be interested to see a faster concept!

>>> from itertools import combinations, chain
>>> 
>>> available = [-10, -5, -2, 7, 9, 15]
>>> target = (10, 18)
>>> 
>>> 
>>> 
>>> def powerset(iterable): # from https://stackoverflow.com/questions/374626/how-can-i-find-all-the-subsets-of-a-set-with-exactly-n-elements
...     xs = list(iterable)
...     # note we return an iterator rather than a list
...     return chain.from_iterable(combinations(xs, n) for n in range(len(xs)+1))
... 
>>> 
>>> def getMinList(available, target):
...     middleIndex = len(available)/2
...     l1 = available[:middleIndex]
...     l2 = available[middleIndex:]
...     dict1 = {}
...     dict2 = {}
...     for subset in powerset(l1): # reverse so only the smallest subsets are used.
...         total = sum(subset)
...         if total not in dict1:
...             dict1[total] = subset
...     for subset in powerset(l2):
...         total = sum(subset)
...         if total not in dict2:
...             dict2[total] = subset
...     sortedDict1 = sorted(dict1.iteritems())
...     sortedDict2 = sorted(dict2.iteritems())
...     resultList = ()
...     minValues = middleIndex * 2
...     for k1, v1 in sortedDict1:
...         for k2, v2 in reversed(sortedDict2):
...             sumOfSubsets = k1 + k2
...             if sumOfSubsets <= target[1] and sumOfSubsets >= target[0]:
...                 newTuple = v1 + v2
...                 lenNewTuple = len(newTuple)
...                 if (lenNewTuple) < minValues:
...                     resultList = ((sumOfSubsets, newTuple))
...                     minValues = lenNewTuple
...     return resultList
... 
>>> getMinList(available, target)
(15, (15,))
>>> 
>>> target = (10, 10)
>>> 
>>> getMinList(available, target)
(10, (-5, 15))
>>> 
>>> target = (19, 22)
>>> 
>>> getMinList(available, target)
(22, (7, 15))

Upvotes: 3

user2665970
user2665970

Reputation: 11

The worse case, u need to check all the combinations, that is O(2^n). But if you start to check the smallest sublist, u could stop once you find one. Here is my c++ write. It could improve on memory use, but need more work. So depends your input, it could be very fast or slow.

using namespace std;

int compare(int x, int r1, int r2)
{`
    if (x < r1) return -1;
    if (x > r2) return 1;
    return 0;
}

// assume sorted v, binary search
bool hasMemInRange(const vector<int>& v, int r1, int r2)
{
    int b, e, c, r;

    b=0; e=v.size();
    while(e > b) {
        c = (b+e)/2;
        r = compare(v[c], r1, r2);
        if (r < 0) {
            b += max(1, (e-b)/2);
        } else if (r > 0) {
            e -= max(1, (e-b)/2);
        } else {
            return true;
        }
    }
    return false;
}
struct InputNode {
    vector<int> l;
    int         r1, r2;
};

// assume v is sorted
int maxRemoval(const vector<int>& v, int r1, int r2)
{
    if (compare(0, r1, r2) == 0) return v.size();

    if (hasMemInRange(v, r1, r2)) return (v.size() - 1);

    queue<InputNode> q;
    InputNode node;

    node.l = v;
    node.r1 = r1;
    node.r2 = r2;
    q.push(node);

    while(! q.empty()) {
        InputNode& n = q.front();

        if (n.l.size() == 1) {
            return 0;
        }

        for (int i=0; i<n.l.size(); ++i) {
            vector<int> nv = n.l;
            nv.erase(nv.begin() + i);
            node.l = nv;
            node.r1 = r1-n.l[i];
            if (hasMemInRange(nv, node.r1, node.r2)) {
                return (nv.size() - 1);
            }
            q.push(node);
        }
        q.pop();
    }
}

int list_ints[] = {-10, -5, -2, 7, 9, 15 };

int main()
{
    vector<int> l(list_ints, list_ints + sizeof(list_ints)/sizeof(int));

    for (auto i: list_ints) cout << i << "    ";
    cout << endl << endl;

    cout << maxRemoval(l, 10, 18) << endl;
    return 0;
}

Upvotes: 1

Thomas B.
Thomas B.

Reputation: 2296

Using dynamic programming (implemented via memoization) you could use the following:

class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}
    def __call__(self, *args):
        if not args in self.memo:
            self.memo[args] = self.f(*args)
        return self.memo[args]        

def maxsubset(values, min_sum, max_sum):
    target_range = range(min_sum, max_sum+1)

    @Memoize
    def maxsubsetsize(target_sum, current_value_index=len(values)-1):
        if current_value_index < 0:
            if target_sum == 0:
                return 0
            else:
                return float("-inf")

        withit = maxsubsetsize(target_sum - values[current_value_index], current_value_index-1) + 1
        without = maxsubsetsize(target_sum, current_value_index-1)
        return max(withit, without)

    result_sum = max(target_range, key=maxsubsetsize)
    setsize = maxsubsetsize(result_sum)

    result = []
    for i in reversed([x-1 for x in xrange(len(values))]):
        s = maxsubsetsize(result_sum, i)
        if s < setsize:
            result.append(values[i+1])
            setsize -= 1
            result_sum -= values[i+1]

    return result

Usage:

>>> values = [-10, -5, -2, 7, 9, 15]
>>> min_sum = 10
>>> max_sum = 18

>>> xs = maxsubset(values, min_sum-sum(values), max_sum-sum(values))
>>> print xs
[9, 7, -2, -5, -10]
>>> print "sum:", sum(xs)
-1

You could add an additional check if a specific sum is reachable at all. All available negative values give a lower bound on the sum and all available positive numbers give an upper bound.

Upvotes: 1

Related Questions