Reputation: 10353
I have a sorted list of numbers:
[ 1, 2, 3, 4, 6, 7, 8, 9, 14, 15, 16, 17, 18, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 38, 45]
I want to know if numpy has a built-in feature to get something like this out of it:
[ [1, 4], [6, 9], [14, 18], [25,36], [38], [45] ]
Also great if it can ignore holes of certain 2-3 numbers missing in between would still make the range.
I am basically listing frame numbers of a video for processing -- so rather than list out all the frame numbers I would jut put in ranges and its okay if a 3-4 frames in between are missing.
Just want to know if there something that already implements this logic as it seems like a common thing people would want to do - otherwise, I'll implement it myself.
Edit: found a very close question that's answered already: converting a list of integers into range in python
Upvotes: 0
Views: 885
Reputation: 5939
numpy
is not designed to work with to work with arrays that differs in size. It uses internally list.append
to append multiple list to empty array using slicing. It really slows the things down and it's recommended to use only in case user is forced to.
Algorithm of np.split
To better see, how could you improve, let's take a look at general pattern of finding np.split
-like indices of array split:
arr = [1, 2, 3, 4, 6, 7, 8, 9, 14, 15, 16, 17, 18, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 38, 45]
div_points = np.flatnonzero(np.diff(arr)!=1) + 1
start_points = np.r_[0, div_points]
end_points = np.r_[div_points, len(arr)]
So now you've got start and end arguments of slices that split an array into multiple sub-arrays:
np.transpose([start_points, end_points])
>>> array([[ 0, 4],
[ 4, 8],
[ 8, 13],
[13, 25],
[25, 26],
[26, 27]])
And there is a mechanism that np.split
uses internally to split an array:
container = []
for start, end in np.transpose([start_points, end_points]):
container.append(arr[start:end])
>>> container
[[1, 2, 3, 4],
[6, 7, 8, 9],
[14, 15, 16, 17, 18],
[25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36],
[38],
[45]]
Variation of algorithm
To arrive at an output nearly similar to what you expect, you could modify algorithm of np.split
like so:
div_points = np.flatnonzero(np.diff(arr)!=1) + 1
start_points = np.r_[0, div_points]
end_points = np.r_[div_points, len(arr)]
out = np.transpose([arr[start_points], arr[end_points - 1]])
out
>>> array([[ 1, 4],
[ 6, 9],
[14, 18],
[25, 36],
[38, 38],
[45, 45]])
Upvotes: 1
Reputation:
Since it's tagged as numpy, here is a numpy solution (sort of). There is no native numpy function but you could use diff
+ where
+ split
+ list comprehension:
>>> [[ary[0], ary[-1]] if len(ary)>1 else [ary[0]] for ary in np.split(arr, np.where(np.diff(arr) != 1)[0] + 1)]
[[1, 4], [6, 9], [14, 18], [25, 36], [38], [45]]
If the array is large, it's more efficient to use a loop rather than np.split
, so you could use the function below which produces the same outcome as above:
def array_to_range(arr):
idx = np.r_[0, np.where(np.diff(arr) != 1)[0]+1, len(arr)]
out = []
for i,j in zip(idx, idx[1:]):
ary = arr[i:j]
if len(ary) > 1:
out.append([ary[0], ary[-1]])
else:
out.append([ary[0]])
return out
Upvotes: 5