user119249
user119249

Reputation: 127

Minimum sum that cant be obtained from a set

Given a set S of positive integers whose elements need not to be distinct i need to find minimal non-negative sum that cant be obtained from any subset of the given set.

Example : if S = {1, 1, 3, 7}, we can get 0 as (S' = {}), 1 as (S' = {1}), 2 as (S' = {1, 1}), 3 as (S' = {3}), 4 as (S' = {1, 3}), 5 as (S' = {1, 1, 3}), but we can't get 6.

Now we are given one array A, consisting of N positive integers. Their are M queries,each consist of two integers Li and Ri describe i'th query: we need to find this Sum that cant be obtained from array elements ={A[Li], A[Li+1], ..., A[Ri-1], A[Ri]} .

I know to find it by a brute force approach to be done in O(2^n). But given 1 ≤ N, M ≤ 100,000.This cant be done . So is their any effective approach to do it.

Upvotes: 0

Views: 4017

Answers (2)

neeKo
neeKo

Reputation: 4280

Concept

Suppose we had an array of bool representing which numbers so far haven't been found (by way of summing).

numbers array

For each number n we encounter in the ordered (increasing values) subset of S, we do the following:

  1. For each existing True value at position i in numbers, we set numbers[i + n] to True
  2. We set numbers[n] to True

With this sort of a sieve, we would mark all the found numbers as True, and iterating through the array when the algorithm finishes would find us the minimum unobtainable sum.


Refinement

Obviously, we can't have a solution like this because the array would have to be infinite in order to work for all sets of numbers.

The concept could be improved by making a few observations. With an input of 1, 1, 3, the array becomes (in sequence):

enter image description here(numbers represent true values) enter image description here numbers array 2

An important observation can be made:

  • (3) For each next number, if the previous numbers had already been found it will be added to all those numbers. This implies that if there were no gaps before a number, there will be no gaps after that number has been processed.

For the next input of 7 we can assert that:

  • (4) Since the input set is ordered, there will be no number less than 7
  • (5) If there is no number less than 7, then 6 cannot be obtained

enter image description here

We can come to a conclusion that:

  • (6) the first gap represents the minimum unobtainable number.

Algorithm

Because of (3) and (6), we don't actually need the numbers array, we only need a single value, max to represent the maximum number found so far.

This way, if the next number n is greater than max + 1, then a gap would have been made, and max + 1 is the minimum unobtainable number.

Otherwise, max becomes max + n. If we've run through the entire S, the result is max + 1.

Actual code (C#, easily converted to C):

static int Calculate(int[] S)
{
    int max = 0;
    for (int i = 0; i < S.Length; i++)
    {
        if (S[i] <= max + 1)
            max = max + S[i];
        else
            return max + 1;
    }
    return max + 1;
}

Should run pretty fast, since it's obviously linear time (O(n)). Since the input to the function should be sorted, with quicksort this would become O(nlogn). I've managed to get results M = N = 100000 on 8 cores in just under 5 minutes.

With numbers upper limit of 10^9, a radix sort could be used to approximate O(n) time for the sorting, however this would still be way over 2 seconds because of the sheer amount of sorts required.

But, we can use statistical probability of 1 being randomed to eliminate subsets before sorting. On the start, check if 1 exists in S, if not then every query's result is 1 because it cannot be obtained.

Statistically, if we random from 10^9 numbers 10^5 times, we have 99.9% chance of not getting a single 1.

Before each sort, check if that subset contains 1, if not then its result is one.

With this modification, the code runs in 2 miliseconds on my machine. Here's that code on http://pastebin.com/rF6VddTx

Upvotes: 15

amit
amit

Reputation: 178421

This is a variation of the subset-sum problem, which is NP-Complete, but there is a pseudo-polynomial Dynamic Programming solution you can adopt here, based on the recursive formula:

f(S,i) = f(S-arr[i],i-1) OR f(S,i-1)
f(-n,i) = false
f(_,-n) = false
f(0,i) = true

The recursive formula is basically an exhaustive search, each sum can be achieved if you can get it with element i OR without element i.

The dynamic programming is achieved by building a SUM+1 x n+1 table (where SUM is the sum of all elements, and n is the number of elements), and building it bottom-up.

Something like:

table <- SUM+1 x n+1 table
//init:
for each i from 0 to SUM+1:
    table[0][i] = true
for each j from 1 to n:
    table[j][0] = false
//fill the table:
for each i from 1 to SUM+1:
    for each j from 1 to n+1:
         if i < arr[j]:
              table[i][j] = table[i][j-1]
         else:
              table[i][j] = table[i-arr[j]][j-1] OR table[i][j-1]

Once you have the table, you need the smallest i such that for all j: table[i][j] = false

Complexity of solution is O(n*SUM), where SUM is the sum of all elements, but note that the algorithm can actually be trimmed after the required number was found, without the need to go on for the next rows, which are un-needed for the solution.

Upvotes: 2

Related Questions