DMin
DMin

Reputation: 10353

Convert numpy array to a range

I have a sorted list of numbers:

[ 1,  2,  3,  4,  6,  7,  8,  9, 14, 15, 16, 17, 18, 25, 
 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 38, 45]

I want to know if numpy has a built-in feature to get something like this out of it:

[ [1, 4], [6, 9], [14, 18], [25,36], [38], [45] ]

Also great if it can ignore holes of certain 2-3 numbers missing in between would still make the range.

I am basically listing frame numbers of a video for processing -- so rather than list out all the frame numbers I would jut put in ranges and its okay if a 3-4 frames in between are missing.

Just want to know if there something that already implements this logic as it seems like a common thing people would want to do - otherwise, I'll implement it myself.

Edit: found a very close question that's answered already: converting a list of integers into range in python

Upvotes: 0

Views: 885

Answers (2)

mathfux
mathfux

Reputation: 5939

numpy is not designed to work with to work with arrays that differs in size. It uses internally list.append to append multiple list to empty array using slicing. It really slows the things down and it's recommended to use only in case user is forced to.

Algorithm of np.split

To better see, how could you improve, let's take a look at general pattern of finding np.split-like indices of array split:

arr = [1, 2, 3, 4, 6, 7, 8, 9, 14, 15, 16, 17, 18, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 38, 45]
div_points = np.flatnonzero(np.diff(arr)!=1) + 1
start_points = np.r_[0, div_points]
end_points = np.r_[div_points, len(arr)]

So now you've got start and end arguments of slices that split an array into multiple sub-arrays:

np.transpose([start_points, end_points])
>>> array([[ 0,  4],
           [ 4,  8],
           [ 8, 13],
           [13, 25],
           [25, 26],
           [26, 27]])

And there is a mechanism that np.split uses internally to split an array:

container = []
for start, end in np.transpose([start_points, end_points]):
    container.append(arr[start:end])

>>> container 
[[1, 2, 3, 4],
 [6, 7, 8, 9],
 [14, 15, 16, 17, 18],
 [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36],
 [38],
 [45]]

Variation of algorithm

To arrive at an output nearly similar to what you expect, you could modify algorithm of np.split like so:

div_points = np.flatnonzero(np.diff(arr)!=1) + 1
start_points = np.r_[0, div_points]
end_points = np.r_[div_points, len(arr)]
out = np.transpose([arr[start_points], arr[end_points - 1]])
out
>>> array([[ 1,  4],
           [ 6,  9],
           [14, 18],
           [25, 36],
           [38, 38],
           [45, 45]])

Upvotes: 1

user7864386
user7864386

Reputation:

Since it's tagged as numpy, here is a numpy solution (sort of). There is no native numpy function but you could use diff + where + split + list comprehension:

>>> [[ary[0], ary[-1]] if len(ary)>1 else [ary[0]] for ary in np.split(arr, np.where(np.diff(arr) != 1)[0] + 1)]
[[1, 4], [6, 9], [14, 18], [25, 36], [38], [45]]

If the array is large, it's more efficient to use a loop rather than np.split, so you could use the function below which produces the same outcome as above:

def array_to_range(arr):
    idx = np.r_[0, np.where(np.diff(arr) != 1)[0]+1, len(arr)]
    out = []
    for i,j in zip(idx, idx[1:]):
        ary = arr[i:j]
        if len(ary) > 1:
            out.append([ary[0], ary[-1]])
        else:
            out.append([ary[0]])
    return out

Upvotes: 5

Related Questions