Reputation: 5771
index = np.array([[1,2],[2,4],[1,5],[5,6]])
z = np.zeros(shape = [4,10], dtype = np.float32)
What is the efficient way to set z[np.arange(4),index[:,0]]
, z[np.arange(4), index[:,1]]
and everything between them as 1?
expected output:
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
Upvotes: 4
Views: 2033
Reputation: 231698
I think this is what you want to do - but with the loop:
In [35]: z=np.zeros((4,10),int)
In [36]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
In [37]: for i in range(4):
...: z[i,index[i,0]:index[i,1]] = 1
...:
In [38]: z
Out[38]:
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]])
Since there differing length slices, this will be tricky to do with one array expression. Maybe not impossible, but tricky enough that it might not be worth trying.
Look at the indices of the 1s in this z
:
In [40]: np.where(z)
Out[40]:
(array([0, 1, 1, 2, 2, 2, 2, 3], dtype=int32),
array([1, 2, 3, 1, 2, 3, 4, 5], dtype=int32))
Is there a regular pattern that could be generated [0,1,2,3] and index
?
I can generate the 2nd row with a concatenation of slices:
In [39]: np.r_[1:2, 2:4, 1:5, 5:6]
Out[39]: array([1, 2, 3, 1, 2, 3, 4, 5])
But notice that r_
involves several iterations - to generate the input, to generate the expanded slices, and to concatenate them.
I can generate the first row of the where
with:
In [41]: index[:,1]-index[:,0]
Out[41]: array([1, 2, 4, 1])
In [42]: np.arange(4).repeat(_)
Out[42]: array([0, 1, 1, 2, 2, 2, 2, 3])
and as expected, those 2 index arrays give us all the 1s:
In [43]: z[Out[42],Out[39]]
Out[43]: array([1, 1, 1, 1, 1, 1, 1, 1])
Or to generate Out[39]
from index
:
In [50]: np.concatenate([np.arange(i,j) for i,j in index])
Out[50]: array([1, 2, 3, 1, 2, 3, 4, 5])
Comparing my solutions with @Divakar's
def foo0(z,index):
for i in range(z.shape[0]):
z[i,index[i,0]:index[i,1]] = 1
return z
def foo4(z,index):
r = np.arange(z.shape[1])
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1
return z
For this small example, row iteration is faster:
In [155]: timeit foo0(z,index)
7.12 µs ± 224 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [156]: timeit foo4(z,index)
19.8 µs ± 890 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Even for larger arrays, the row iteration approach is faster:
In [157]: Z.shape
Out[157]: (1000, 1000)
In [158]: Index.shape
Out[158]: (1000, 2)
In [159]: timeit foo0(Z,Index)
1.72 ms ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [160]: timeit foo4(Z,Index)
7.47 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Upvotes: 1
Reputation: 221734
We can leverage NumPy broadcasting
for a vectorized solution by simply comparing the start and end indices against the ranged array covering the length of columns to give us a mask that represents all the places in the output array required to be assigned as 1s
.
So, the solution would be something like this -
ncols = z.shape[1]
r = np.arange(z.shape[1])
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1
Sample run -
In [39]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
...: z = np.zeros(shape = [4,10], dtype = np.float32)
In [40]: ncols = z.shape[1]
...: r = np.arange(z.shape[1])
...: mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
...: z[mask] = 1
In [41]: z
Out[41]:
array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 1., 1., 0., 0., 0., 0., 0.],
[0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 1., 0., 0., 0.]], dtype=float32)
If z
is always a zeros-initialized
array, we can directly get the output from mask
-
z = mask.astype(int)
Sample run -
In [37]: mask.astype(int)
Out[37]:
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
Comparing @hpaulj's foo0
and mine foo4
as listed in @hpaulj's post for a set with 1000
rows and variable number of columns. We are starting with 10
columns as that was how the input sample was listed and we are giving it a bigger number of rows - 1000
. We would increase the number of columns to 1000
.
Here's the timings -
In [14]: ncols = 10
...: index = np.random.randint(0,ncols,(10000,2))
...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)
In [15]: %timeit foo0(z,index)
...: %timeit foo4(z,index)
100 loops, best of 3: 6.27 ms per loop
1000 loops, best of 3: 594 µs per loop
In [16]: ncols = 100
...: index = np.random.randint(0,ncols,(10000,2))
...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)
In [17]: %timeit foo0(z,index)
...: %timeit foo4(z,index)
100 loops, best of 3: 6.49 ms per loop
100 loops, best of 3: 2.74 ms per loop
In [38]: ncols = 300
...: index = np.random.randint(0,ncols,(1000,2))
...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)
In [39]: %timeit foo0(z,index)
...: %timeit foo4(z,index)
1000 loops, best of 3: 657 µs per loop
1000 loops, best of 3: 600 µs per loop
In [40]: ncols = 1000
...: index = np.random.randint(0,ncols,(1000,2))
...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)
In [41]: %timeit foo0(z,index)
...: %timeit foo4(z,index)
1000 loops, best of 3: 673 µs per loop
1000 loops, best of 3: 1.78 ms per loop
Thus, choosing the best one would depend on the number of columns of the problem set between the loopy and the broadcasting based vectorized one.
Upvotes: 4