Reputation: 323
I have a (NumPy) array representing a measurement curve. I am looking for the first index i such that the N elements starting at i all satisfy some condition, e.g. lie within specific bounds. In pseudocode, I am looking for the minimal i such that
lower_bound < measurement[i:i+N] < higher_bound
is satisfied for all elements in the range.
Of course I could do the following:
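    def first_window_start(measurement, N, lower_bound, higher_bound):
        # Brute force: test every window of N consecutive values.
        for i in range(len(measurement) - N + 1):  # +1 so the last window is included
            test_vals = measurement[i:i + N]
            if all(lower_bound < x < higher_bound for x in test_vals):
                return i
        return None  # no window satisfies the bounds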
This is extremely inefficient, as I am always comparing N values for every i.
What is the most Pythonic way to achieve this? Does NumPy have built-in functionality for this?
EDIT: As requested, here is some example input data:
a = [1,2,3,4,5,5,6,7,8,5,4,5]
lower_bound = 3.5
upper_bound = 5.5
N = 3
should return 3, as starting at a[3] the elements are within the bounds for at least 3 values.
Upvotes: 4
Views: 1282
Reputation: 18628
If M is the length of a, here is an O(M) solution:
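    import numpy as np
    a = np.asarray(a)  # the comparisons below need an array, not a list
    locations = (lower_bound < a) & (a < upper_bound)
    # Prefix sums with a leading 0: cum[j] counts in-bounds values in a[:j].
    cum = np.concatenate(([0], locations.cumsum()))
    # a[i:i+N] is fully in bounds when it contains exactly N in-bounds values.
    lengths = cum[N:] - cum[:-N] == N
    result = lengths.nonzero()[0][0]  # IndexError if no window qualifies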
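For reference, here is a self-contained sketch of the cumulative-sum idea run on the example data from the question. The helper name first_run_start and the choice to return None when nothing matches are my own assumptions, not part of the answer. The prefix sums are padded with a leading zero so that a run starting at index 0 is also detected.

```python
import numpy as np

def first_run_start(values, n, lower_bound, upper_bound):
    """Hypothetical helper: first i with all of values[i:i+n] strictly in bounds, else None."""
    values = np.asarray(values)
    if n <= 0 or n > len(values):
        return None
    in_bounds = (lower_bound < values) & (values < upper_bound)
    # cum[j] = number of in-bounds elements in values[:j]; the leading 0
    # lets a run that starts at index 0 be detected.
    cum = np.concatenate(([0], np.cumsum(in_bounds)))
    # Window i is fully in bounds when it contains exactly n in-bounds values.
    full = (cum[n:] - cum[:-n]) == n
    hits = np.flatnonzero(full)
    return int(hits[0]) if hits.size else None

a = [1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5]
print(first_run_start(a, 3, 3.5, 5.5))  # prints 3
```

This stays O(M) in time and needs only a few length-M temporary arrays, regardless of N.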
Upvotes: 2
Reputation: 365
This answer could be helpful to you, although it is not specifically for numpy:
What is the best way to get the first item from an iterable matching a condition?
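The idea from that question is to pass a generator expression to the built-in next() with a default value. A minimal sketch of how it might be applied here, assuming the window test from the question (the function name first_window_start is only for illustration):

```python
def first_window_start(measurement, n, lower_bound, higher_bound):
    """Return the first start index of an in-bounds window of length n, or None."""
    starts = range(len(measurement) - n + 1)
    # next() stops at the first matching index; None is the default if none match.
    return next(
        (i for i in starts
         if all(lower_bound < x < higher_bound for x in measurement[i:i + n])),
        None,
    )

measurement = [1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5]
print(first_window_start(measurement, 3, 3.5, 5.5))  # prints 3
```

This still does up to N comparisons per start index, so it is a tidier form of the brute-force loop, not a faster one.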
Upvotes: 0
Reputation: 221564
One NumPythonic vectorized solution would be to create sliding-window indices across the entire length of the input array measurement, stacked as a 2D array, then index into measurement with those indices to form a 2D windowed version of it. Next, look for bound successes in one go with np.all(..axis=1) after the bound checks. Finally, choose the first success index as the output. The implementation would go something along these lines -
m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Sample run -
In [1]: measurement = np.array([1,2,3,4,5,5,6,7,8,5,4,5])
   ...: lower_bound = 3.5
   ...: higher_bound = 5.5
   ...: N = 3
   ...:
In [2]: m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
In [3]: m2D # Notice that this is a 2D (shifted) version of the input
Out[3]:
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 5],
       [5, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 5],
       [8, 5, 4],
       [5, 4, 5]])
In [4]: np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Out[4]: 3
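As a possible variant, assuming NumPy 1.20 or newer is available, the same 2D window array can be obtained as a view with numpy.lib.stride_tricks.sliding_window_view instead of fancy indexing, which avoids copying the data into an (M-N+1) x N array. The None result for "no window found" is my own addition in this sketch:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

measurement = np.array([1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5])
lower_bound, higher_bound, N = 3.5, 5.5, 3

# Read-only view of shape (len(measurement) - N + 1, N); no data is copied.
windows = sliding_window_view(measurement, N)
# True for each window whose N values all lie strictly inside the bounds.
ok = np.all((lower_bound < windows) & (windows < higher_bound), axis=1)
# argmax returns the first True position; guard against "no match" explicitly.
first = int(np.argmax(ok)) if ok.any() else None
print(first)  # prints 3
```

The boolean comparison still materializes an (M-N+1) x N temporary, so for large N a cumulative-sum approach may use less memory.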
Upvotes: 3