Reputation: 111

Finding sorted sub-sequences in a permutation

Given an array A which holds a permutation of 1,2,...,n. A sub-block A[i..j] of an array A is called a valid block if all the numbers appearing in A[i..j] are consecutive numbers (may not be in order.

Given an array A= [ 7 3 4 1 2 6 5 8] the valid blocks are [3 4], [1,2], [6,5], [3 4 1 2], [3 4 1 2 6 5], [7 3 4 1 2 6 5], [7 3 4 1 2 6 5 8]

Give an O(n log n) algorithm to count the number of valid blocks.

Upvotes: 11

Answers (4)

user382751

Reputation: 1213

Here's a worst-case O(n log n) divide and conquer algorithm. Given a non-empty sublist of a permutation, divide it into a left half, a middle element, and a right half. Recursively compute the number of blocks contained in the left half and the number of blocks contained in the right half. Now in O(n) time, compute the number of blocks that include the middle element as follows.

Observe that the intersection of two valid blocks is either empty or a valid block and that the whole permutation is a valid block. Together, these facts imply the existence of closures: unique minimal valid blocks that contain a specified (non-contiguous) subsequence. Define a left extension to be a closure of the middle element and an element not in the right half. Define a right extension to be a closure of the middle element and an element not in the left half. The left extensions (respectively, the right extensions) are totally ordered with respect to the sublist relation. They can be computed in order in linear time by means of a simple work-list algorithm.

Observe now that the union of two overlapping valid blocks is itself a valid block. I claim that every valid block containing the middle element can be written as the union of the left extension generated by the leftmost element with the right extension generated by the rightmost element. To count unions of this form, iterate through the left extensions in increasing order. Maintain pointers to the least right extension whose rightmost element is not left of the left extension's rightmost, and to the least whose leftmost element is left of the left extension's leftmost. Because of monotonicity, these pointers can only move to larger extensions, so the total work is linear. They bound above and below the eligible partners for the current left extension.

C++ implementation:

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <queue>
#include <stdexcept>
#include <vector>

namespace {
typedef std::vector<int> IntVector;

struct Interval {
  int left;
  int right;
};

Interval MakeInterval(int left, int right) {
  Interval i = {left, right};
  return i;
}

typedef std::vector<Interval> IntervalVector;

enum Direction {
  kLeft,
  kRight,
};

// Finds the valid intervals obtained by starting with [pi[mid],
// pi[mid]] and repeatedly extending in direction dir
//
// O(right_boundary - left_boundary)
void FindExtensions(const IntVector& pi, const IntVector& pi_inv,
                    int left_boundary, int right_boundary,
                    Direction dir, IntervalVector* extensions) {
  int mid = left_boundary + (right_boundary - left_boundary) / 2;
  int left = mid;
  int right = mid;
  int lower = pi[mid];
  int upper = pi[mid];
  std::queue<int> worklist;
  while (true) {
    if (worklist.empty()) {
      extensions->push_back(MakeInterval(left, right));
      if (dir == kLeft) {
        if (left == left_boundary) break;
        --left;
        worklist.push(left);
      } else {
        if (right == right_boundary) break;
        ++right;
        worklist.push(right);
      }
    } else {
      int i = worklist.front();
      worklist.pop();
      if (i < left) {
        if (i < left_boundary) break;
        for (int j = left - 1; j >= i; --j) worklist.push(j);
        left = i;
      } else if (right < i) {
        if (right_boundary < i) break;
        for (int j = right + 1; j <= i; ++j) worklist.push(j);
        right = i;
      }
      int x = pi[i];
      if (x < lower) {
        for (int y = lower - 1; y > x; --y) worklist.push(pi_inv[y]);
        lower = x;
      } else if (upper < x) {
        for (int y = upper + 1; y < x; ++y) worklist.push(pi_inv[y]);
        upper = x;
      }
    }
  }
}

int CountValidRecursive(const IntVector& pi, const IntVector& pi_inv,
                        int left, int right) {
  if (right < left) return 0;
  int mid = left + (right - left) / 2;
  int count = CountValidRecursive(pi, pi_inv, left, mid - 1) +
      CountValidRecursive(pi, pi_inv, mid + 1, right);
  IntervalVector left_exts;
  FindExtensions(pi, pi_inv, left, right, kLeft, &left_exts);
  IntervalVector right_exts;
  FindExtensions(pi, pi_inv, left, right, kRight, &right_exts);
  typedef IntervalVector::const_iterator IVci;
  IVci first_good = right_exts.begin();
  IVci first_bad = right_exts.begin();
  for (IVci ext = left_exts.begin(); ext != left_exts.end(); ++ext) {
    while (first_good != right_exts.end() && first_good->right < ext->right) {
      ++first_good;
    }
    if (first_good == right_exts.end()) break;
    while (first_bad != right_exts.end() && ext->left <= first_bad->left) {
      ++first_bad;
    }
    count += std::distance(first_good, first_bad);
  }
  return count;
}

// Counts the number of intervals in pi that consist of consecutive
// integers
//
// O(n log n)
int CountValid(const IntVector& pi) {
  int n = pi.size();
  IntVector pi_inv(n, -1);
  // Validate and invert pi
  for (int i = 0; i < n; ++i) {
    if (pi[i] < 0 || pi[i] >= n || pi_inv[pi[i]] != -1) {
      throw std::runtime_error("Invalid permutation of {0, ..., n - 1}");
    }
    pi_inv[pi[i]] = i;
  }
  return CountValidRecursive(pi, pi_inv, 0, n - 1);
}

// For testing purposes
int SlowCountValid(const IntVector& pi) {
  int count = 0;
  int n = pi.size();
  for (int left = 0; left < n; ++left) {
    for (int right = left; right < n; ++right) {
      int lower = pi[left];
      int upper = pi[left];
      for (int i = left + 1; i <= right; ++i) {
        if (pi[i] < lower) {
          lower = pi[i];
        } else if (pi[i] > upper) {
          upper = pi[i];
        }
      }
      if (upper - lower == right - left) ++count;
    }
  }
  return count;
}
}  // namespace

int main() {
  int n = 10;
  IntVector pi(n);
  for (int i = 0; i < n; ++i) pi[i] = i;
  do {
    if (SlowCountValid(pi) != CountValid(pi)) {
      fprintf(stderr, "Bad permutation:");
      for (IntVector::const_iterator x = pi.begin(); x != pi.end(); ++x) {
        fprintf(stderr, " %d", *x);
      }
      fputc('\n', stderr);
    }
  } while (std::next_permutation(pi.begin(), pi.end()));
}

Upvotes: 18

Phil H

Reputation: 20141

Imagine you had a function which, given a list of n integers could tell you if they were a valid block or not.

Imagine you modified this function so it instead returned a count of how many sub blocks were valid, so given [1 3 2 4] would find [1 3 2 4], [1 3 2], [3 2 4], [3 2].

To make that modification, you'd just loop through all the possible sub blocks, passing them to the original function until you had just 2 numbers:

1,3,2,4 is valid
1,3,2 is valid
1,3 is NOT valid
3,2,4 is valid
3,2 is valid
There were 4 valid sub blocks

The first function then:

def isValid(li):
    return (max(li)-min(li) == len(li)-1)

That is, assuming all values are different, the largest value minus the smallest should yield the length of the array minus 1. For [1 3 2 4], largest = 4, smallest = 1, max-min=3, length-1 = 3, job done.

The main checking function:

def countValidSubs(li):
    count = 0
    length = len(li)
    for start in range(0,length-2):
        for newlen in range(length-start,1,-1):
            newli = li[start:start+newlen]
            if isValid(newli):
                print(','.join(`i` for i in newli)+" is valid")
                count += 1
            else:
                print(','.join(`i` for i in newli)+" is NOT valid")
    return count

Then just call like:

countValidSubs([1, 3, 2, 4, 5, 7, 9, 8, 6])

(Answer there is 14, by the way)

The only problem with this as a homework answer is that it is O(n²/2)

Upvotes: 1

Nick Larsen

Reputation: 18877

This is not quite a full solution, but a starting point for more thought.

The trick appears to lie in the fact that the set is always 1,2,...,n from which it is clear that the entire set is always the first obviously valid block. If you start from that full set and work your way down, it seems to be a little easier to grasp.

The set with the most valid blocks is M = [1 2 3 4 5 . . . n] because every subset [i..j] is guaranteed to be a valid block. M is a good test case because you can easily determine the actual number of valid blocks: ((n - 1) / 2) * n.

Upvotes: 1

bhups

Reputation: 14885

I don't think you need an algorithm. Here is what you need. Summation (i =0 to i = n-1 ) (n - i) * (i + 1)!

Upvotes: 0

Finding sorted sub-sequences in a permutation

Answers (4)

Related Questions