Reputation: 59
What I need is an array of all indices where my data array, which is filled with zeros and ones, steps from zero to one. I need a very fast solution, because I have to work with millions of arrays, each hundreds of millions of elements long. It will be running in a computing centre. For instance:
data_array = np.array([1,1,0,1,1,1,0,0,0,1,1,1,0,1,1,0])
result = [3,9,13]
Upvotes: 2
Views: 89
Reputation: 221574
Since it's an array filled with 0s and 1s, you can benefit from just comparing the one-shifted versions rather than performing an arithmetic operation between them. The comparison directly gives us the boolean array, which can be fed to np.flatnonzero to get the indices and the final output.
Thus, we would have an implementation like so -
np.flatnonzero(data_array[1:] > data_array[:-1])+1
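As a quick check, here is a minimal, self-contained sketch that applies the comparison to the sample array from the question. The variable name `rising` is only illustrative and not part of the original answer.

```python
import numpy as np

# Sample data from the question.
data_array = np.array([1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0])

# data_array[1:] > data_array[:-1] is True wherever an element is larger
# than its predecessor, i.e. at a 0 -> 1 step. The comparison result is one
# element shorter than the input, so add 1 to get indices into data_array.
rising = np.flatnonzero(data_array[1:] > data_array[:-1]) + 1

print(rising)  # [ 3  9 13]
```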
Runtime test -
In [26]: a = np.random.choice([0,1], 10**8)
In [27]: %timeit np.nonzero((a[1:] - a[:-1]) == 1)[0] + 1
1 loop, best of 3: 1.91 s per loop
In [28]: %timeit np.where(np.diff(a)==1)[0] + 1
1 loop, best of 3: 1.91 s per loop
In [29]: %timeit np.flatnonzero(a[1:] > a[:-1])+1
1 loop, best of 3: 954 ms per loop
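To reproduce this kind of comparison outside IPython, something like the following sketch could be used. It relies on the standard `timeit` module, uses a smaller array (10**6 elements) so it finishes quickly, and the labels in `candidates` are made up for illustration. Absolute timings will depend on the machine, the dtype, and the NumPy version, so none are quoted here.

```python
import timeit

import numpy as np

# Assumption: a seeded generator and a 10**6-element array, chosen only to
# keep the run short; the timings above used 10**8 elements.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=10**6)

# The three variants timed in this thread.
candidates = {
    "subtract": lambda: np.nonzero((a[1:] - a[:-1]) == 1)[0] + 1,
    "diff": lambda: np.where(np.diff(a) == 1)[0] + 1,
    "compare": lambda: np.flatnonzero(a[1:] > a[:-1]) + 1,
}

# Compute each result once so the variants can be checked against each other.
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    # Best of 3 repeats, 3 calls each, reported per call.
    best = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{name:10s} {best * 1e3:8.2f} ms per call")
```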
Upvotes: 0
Reputation: 59
Well, thanks a lot to all of you. The solution with nonzero is probably better for me, because I need to know the steps from 0->1 and also from 1->0, and finally calculate the differences between them. So this is my solution. Any other advice is appreciated :)
i_in = np.nonzero( (data_array[1:] - data_array[:-1]) == 1 )[0] +1
i_out = np.nonzero( (data_array[1:] - data_array[:-1]) == -1 )[0] +1
i_return_in_time = (i_in - i_out[:i_in.size] )
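One possible variation, offered only as a sketch: both directions can be found with comparisons instead of subtraction. Comparisons do not depend on signed arithmetic, so this also works if the data is stored as an unsigned type such as `uint8`, where `0 - 1` wraps around to 255 and the `== -1` test would never match. The helper name `step_indices` is hypothetical, and truncating both index arrays to a common length `n` is an assumption about how the steps should be paired.

```python
import numpy as np

def step_indices(data):
    """Return (rising, falling) index arrays for a 0/1 array.

    Hypothetical helper: rising marks 0 -> 1 steps, falling marks 1 -> 0.
    """
    prev, curr = data[:-1], data[1:]
    rising = np.flatnonzero(curr > prev) + 1   # 0 -> 1
    falling = np.flatnonzero(curr < prev) + 1  # 1 -> 0
    return rising, falling

# Sample data from the question, stored compactly as uint8.
data_array = np.array([1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0],
                      dtype=np.uint8)

i_in, i_out = step_indices(data_array)

# Assumption: pair the k-th rising step with the k-th falling step and drop
# any unpaired trailing step, mirroring i_in - i_out[:i_in.size] above.
n = min(i_in.size, i_out.size)
i_return_in_time = i_in[:n] - i_out[:n]

print(i_in)              # [ 3  9 13]
print(i_out)             # [ 2  6 12 15]
print(i_return_in_time)  # [1 3 1]
```

Whether the k-th rising step really belongs with the k-th falling step depends on whether the array starts with a 0 or a 1, so the pairing may need an offset for other data.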
Upvotes: 0
Reputation: 210842
try this:
In [23]: np.where(np.diff(a)==1)[0] + 1
Out[23]: array([ 3, 9, 13], dtype=int64)
Timing for 100M element array:
In [46]: a = np.random.choice([0,1], 10**8)
In [47]: %timeit np.nonzero((a[1:] - a[:-1]) == 1)[0] + 1
1 loop, best of 3: 1.46 s per loop
In [48]: %timeit np.where(np.diff(a)==1)[0] + 1
1 loop, best of 3: 1.64 s per loop
Upvotes: 3
Reputation: 68146
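Here's the procedure: take the difference between adjacent elements, find where it equals 1, and add 1 to the resulting indices (because len(diff) = len(orig) - 1).
So try this: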
index = numpy.nonzero((data_array[1:] - data_array[:-1]) == 1)[0] + 1
index
# [3, 9, 13]
Upvotes: 1