Tuples of closed continuous intervals

Question

Say I have the following list of numbers:

my_array = [0, 3, 4, 7, 8, 9, 10, 20, 21, 22, 70]

I would like to find every closed interval containing consecutive integers without gaps in this list. If for any number in the list there are multiple such intervals, we only retain the largest of any such intervals. The correct answer above should be:

[0,   0]
[3,   4]
[7,  10]
[20, 22]
[70, 70]

To see this, note for example:

The closed interval [0,0] contains the integer 0, contains no gaps, and none of its members are contained in any other closed interval.
The closed interval [3,4] contains no gaps and its members are not contained in any other closed interval with no gaps that is larger than itself.

How can I do this in numpy? I started writing an algorithm that uses np.diff(my_array) to detect transitions in the array, but it fails on corner cases such as intervals containing only one item.

David Eisenstat · Accepted Answer

I don't have a numpy install handy, but this is the approach that I would take. First handle the case of an empty array separately. Sort the array if it isn't already sorted and use np.diff to compute the differences.

0, 3, 4, 7, 8, 9, 10,  20, 21, 22,  70
  3  1  3  1  1  1   10   1   1   48

Test the differences for being > 1.

  1  0  1  0  0  0    1   0   0    1

To get the beginnings of the intervals, put a 1 at the beginning and select the corresponding array items. To get the ends, put a 1 at the end and select the corresponding array items.

0, 3, 4, 7, 8, 9, 10,  20, 21, 22,  70
1  1  0  1  0  0   0    1   0   0    1
0  3     7             20           70

0, 3, 4, 7, 8, 9, 10,  20, 21, 22,  70
1  0  1  0  0  0   1    0   0   1    1
0     4           10           22   70

Implementation (largely by user815423426):

def get_intervals(my_array):
  my_diff = np.diff(my_array)>1
  begins  = np.insert(my_diff, 0, 1)
  ends    = np.insert(my_diff, -1, 1)
  return np.array(np.dstack((my_array[begins], my_array[ends])))

my_array = np.array([0, 3, 4, 7, 8, 9, 10, 20, 21, 22, 70])
> get_intervals(my_array)
array([[ 0,  0],
       [ 3,  4],
       [ 7, 10],
       [20, 22],
       [70, 70]])

Tuples of closed continuous intervals

Answers (2)

Related Questions