Reputation: 105

How to find minimum positive contiguous sub sequence in O(n) time?

There is a known O(n) algorithm for finding the maximum-sum contiguous subsequence of a given sequence. Can anybody suggest a similar algorithm for finding the minimum positive contiguous subsequence, that is, the contiguous run whose sum is positive and as small as possible?

For example, if the given sequence is 1,2,3,4,5 the answer should be 1. For [5,-4,3,5,4] the answer is also 1, the minimum positive sum, produced by the elements [5,-4].
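To pin down what is being asked, here is a brute-force reference sketch in Java. It is quadratic, so it is not an answer to the O(n) question; it only defines the expected result. The class name, method name, and the -1 "no positive sum" sentinel are assumptions made for illustration.

```java
// Brute-force reference: try every contiguous run and keep the smallest positive sum.
final class MinPositiveBruteForce {
    /** Returns the smallest positive sum of any contiguous run, or -1 if no run has a positive sum. */
    static long minPositiveSum(int[] a) {
        long best = -1; // assumed sentinel meaning "nothing positive found yet"
        for (int start = 0; start < a.length; start++) {
            long sum = 0; // long avoids int overflow on large inputs
            for (int end = start; end < a.length; end++) {
                sum += a[end]; // sum of a[start..end]
                if (sum > 0 && (best == -1 || sum < best)) {
                    best = sum;
                }
            }
        }
        return best;
    }
}
```

Any faster candidate can be checked against this method on small random inputs.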

Upvotes: 11

Answers (3)

Bert te Velde

Reputation: 853

I believe there's an O(n) algorithm, see below.

Note: it has a scale factor that might make it less attractive in practical applications: the running time also depends on the magnitude of the input values being processed, see the remarks in the code.

    private int GetMinimumPositiveContiguousSubsequence(List<Int32> values)
    {
      // Note: this method has no precautions against integer over/underflow, which may occur
      // if large (abs) values are present in the input-list.

      // There must be at least 1 item.
      if (values == null || values.Count == 0)
        throw new ArgumentException("There must be at least one item provided to this method.");

      // 1. Scan once to:
      //    a) Get the minimum positive element;
      //    b) Get the value of the MAX contiguous sequence
      //    c) Get the value of the MIN contiguous sequence - allowing negative values: the mirror of the MAX contiguous sequence.
      //    d) Pinpoint the (index of the) first negative value.

      int minPositive = 0;

      int maxSequence = 0;
      int currentMaxSequence = 0;

      int minSequence = 0;
      int currentMinSequence = 0;

      int indxFirstNegative = -1;

      for (int k = 0; k < values.Count; k++)
      {
        int value = values[k];

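        if (value > 0)
        {
          if (minPositive == 0 || value < minPositive)
            minPositive = value;
        }
        else if (indxFirstNegative == -1 && value < 0)
        {
          // Braces are required here: without them this 'else' binds to the inner 'if'
          // and the first negative index is never recorded.
          indxFirstNegative = k;
        }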

        currentMaxSequence += value;
        if (currentMaxSequence <= 0)
          currentMaxSequence = 0;
        else if (currentMaxSequence > maxSequence)
          maxSequence = currentMaxSequence;

        currentMinSequence += value;
        if (currentMinSequence >= 0)
          currentMinSequence = 0;
        else if (currentMinSequence < minSequence)
          minSequence = currentMinSequence;
      }

      // 2. We're done if (a) there are no negatives, or (b) the minPositive (single) value is 1 (or 0...).
      if (minSequence == 0 || minPositive <= 1)
        return minPositive;

      // 3. Real work to do.
      // The strategy is as follows, iterating over the input values:
      // a) Keep track of the cumulative value of ALL items - the sequence that starts with the very first item.
      // b) Register each such cumulative value as "existing" in a bool array 'initialSequence' as we go along.
      //    We know already the max/min contiguous sequence values, so we can properly size that array in advance.
      //    Since negative sequence values occur we'll have an offset to match the index in that bool array
      //    with the corresponding value of the initial sequence.
      // c) For each next input value to process scan the "initialSequence" bool array to see whether relevant entries are TRUE.
      //    We don't need to go over the complete array, as we're only interested in entries that would produce a subsequence with
      //    a value that is positive and also smaller than best-so-far.
      //    (As we go along, the range to check will normally shrink as we get better and better results.
      //     Also: initially the range is already limited by the single-minimum-positive value that we have found.)

      // Performance-wise this approach (which is O(n)) is suitable IFF the number of input values is large (or at least: not small) relative to
      // the spread between maxSequence and minSequence: the latter two define the size of the array in which we will do (partial) linear traversals.
      // If this condition is not met it may be more efficient to replace the bool array by a (binary) search tree
      // (which will result in O(n log n) performance).
      // Since we know the relevant parameters at this point, we could implement both strategies below and decide at run time
      // which to choose.
      // The current implementation has only the fixed bool array approach.

      // Initialize a variable to keep track of the best result 'so far'; it will also be the return value.
      int minPositiveSequence = minPositive;

      // The bool array to keep track of which (total) cumulative values (always with the sequence starting at element #0) have occurred so far,
      // and the 'offset' - see remark 3b above.
      int offset = -minSequence;
      bool[] initialSequence = new bool[maxSequence + offset + 1];

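      int valueCumulative = 0;

      // The empty prefix (cumulative value 0) must be registered too; otherwise subsequences
      // that start at element #0 are never considered (e.g. -1, 3, -1 would yield 2 instead of 1).
      initialSequence[offset] = true;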

      for (int k = 0; k < indxFirstNegative; k++)
      {
        int value = values[k];
        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;
      }

      for (int k = indxFirstNegative; k < values.Count; k++)
      {
        int value = values[k];

        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;

        // Check whether the difference with any previous "cumulative" may improve the optimum-so-far.

        // the index that, if the entry is TRUE, would yield the best possible result.
        int indexHigh = valueCumulative + offset - 1;
        // the last (lowest) index that, if the entry is TRUE, would still yield an improvement over what we have so far.
        int indexLow = Math.Max(0, valueCumulative + offset - minPositiveSequence + 1);

        for (int indx = indexHigh; indx >= indexLow; indx--)
        {
          if (initialSequence[indx])
          {
            minPositiveSequence = valueCumulative - indx + offset;

            if (minPositiveSequence == 1)
              return minPositiveSequence;

            break;
          }
        }
      }

      return minPositiveSequence;
    }

Upvotes: 1

Pham Trung

Reputation: 11284

We can have an O(n log n) algorithm as follows:

Assume we have an array prefix, where index i stores the sum of array A from 0 to i, so the sum of sub-array (i, j) is prefix[j] - prefix[i - 1].

Thus, to find the minimum positive sub-array ending at index j, we need to find the maximum element prefix[x] that is less than prefix[j], with x < j. We can find that element in O(log n) time if we use a binary search tree.
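To make the prefix-sum reasoning above concrete, here is a small Java sketch that finds the minimum positive sub-array ending at a given index j. It uses a plain linear scan where the answer uses a binary search tree, so each query costs O(n) rather than O(log n). Two things are assumptions of this sketch rather than part of the answer: the prefix array is shifted by one so that prefix[0] = 0 represents the empty prefix, and -1 is returned when no positive sum exists.

```java
// Illustrates: sum of a[x..j] = prefix[j + 1] - prefix[x], with prefix[0] = 0.
final class PrefixScan {
    /** Smallest positive sum of a sub-array ending at index j, or -1 if none exists. */
    static long minPositiveEndingAt(int[] a, int j) {
        long[] prefix = new long[a.length + 1]; // prefix[k] = a[0] + ... + a[k - 1]
        for (int k = 0; k < a.length; k++) {
            prefix[k + 1] = prefix[k] + a[k];
        }
        long target = prefix[j + 1];
        long best = -1;
        // Linear scan over earlier prefixes (including the empty one);
        // a balanced search tree would find the same value in O(log n).
        for (int x = 0; x <= j; x++) {
            long diff = target - prefix[x]; // sum of a[x..j]
            if (diff > 0 && (best == -1 || diff < best)) {
                best = diff;
            }
        }
        return best;
    }
}
```

For the question's example [5,-4,3,5,4] the shifted prefix sums are 0, 5, 1, 4, 9, 13. At j = 1 the largest earlier prefix below 1 is the empty prefix 0, giving the sum 1 for [5,-4].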

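Java code (a TreeSet plays the role of the binary search tree):

import java.util.TreeSet;

// Returns Long.MAX_VALUE if no sub-array has a positive sum.
static long minPositiveSubarraySum(int[] A) {
    TreeSet<Long> seen = new TreeSet<>();
    seen.add(0L);                       // empty prefix, so a sub-array may start at index 0
    long prefix = 0;                    // running prefix sum; long avoids int overflow
    long result = Long.MAX_VALUE;
    for (int a : A) {
        prefix += a;
        Long v = seen.lower(prefix);    // largest earlier prefix strictly less than prefix
        if (v != null)
            result = Math.min(result, prefix - v);
        seen.add(prefix);
    }
    return result;
}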

Upvotes: 3

Juan Lopes

Reputation: 10565

There cannot be such an algorithm. The lower bound for this problem is Ω(n log n). I'll prove it by reducing the element distinctness problem to it (actually to the non-negative variant of it).

Let's suppose we have an O(n) algorithm for this problem (the minimum non-negative subarray).

We want to find out if an array (e.g. A=[1, 2, -3, 4, 2]) has only distinct elements. To solve this problem, I could construct an array with the differences between consecutive elements (e.g. A'=[1, -5, 7, -2]) and run the O(n) algorithm we have. Every subarray sum of A' telescopes to A[j] - A[i] for some i < j, so it is 0 exactly when two elements of A are equal. The original array has only distinct elements if and only if the minimum non-negative subarray sum is greater than 0.

If we had an O(n) algorithm for your problem, we would have an O(n) algorithm for the element distinctness problem, which we know is not possible on a Turing machine.
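The reduction can be sketched in Java as follows. The solver used here is a quadratic brute-force stand-in, not the hypothetical O(n) algorithm; it only shows how element distinctness is decided through a single call to a "minimum non-negative subarray" routine. The class and method names and the -1 "no non-negative sum" sentinel are assumptions made for illustration.

```java
final class DistinctnessReduction {
    /** Stand-in solver: smallest non-negative sub-array sum, or -1 if every sum is negative. */
    static long minNonNegativeSum(long[] d) {
        long best = -1;
        for (int start = 0; start < d.length; start++) {
            long sum = 0;
            for (int end = start; end < d.length; end++) {
                sum += d[end];
                if (sum >= 0 && (best == -1 || sum < best)) {
                    best = sum;
                }
            }
        }
        return best;
    }

    /** True if all elements of a are distinct, decided only via the sub-array solver. */
    static boolean allDistinct(int[] a) {
        if (a.length < 2) {
            return true;
        }
        long[] diff = new long[a.length - 1];
        for (int i = 0; i + 1 < a.length; i++) {
            diff[i] = (long) a[i + 1] - a[i]; // consecutive differences, as in A'
        }
        // A sub-array of diff sums to a[j] - a[i]; a result of 0 means a repeated value.
        return minNonNegativeSum(diff) != 0;
    }
}
```

On the example above, A' contains the run -5, 7, -2, which sums to 0 because the value 2 appears twice in A, so allDistinct reports false.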

Upvotes: 4
