Reputation: 488
I have a piece of numpy code that I know works. I know this because I have tested it in my generic case successfully. However, I arrived at the solution after two hours of back and forth referencing the docs and trial and error. I can't grasp how I would know to do this intuitively.
The setup:
a = np.zeros((5,5,3))
The goal: Set to 1 indices 0,1 of axis 1, 0,1 of axis 2, all of axis 3 and indices 3,4 of axis 1, 3,4 of axis 2, all of axis 3
Clearer goal: Set block 1 and 2's first two rows to 1 and block 3 and 4's last two rows to 1
The result:
ax1 =np.array([np.array([0,1]),np.array([3,4])])
ax1 =np.array([x[:,np.newaxis] for x in ax1])
ax2 = np.array([[[0,1]],[[3,4]]])
a[ax1,ax2,:] = 1
a
Output:
array([[[1., 1., 1.],
[1., 1., 1.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[1., 1., 1.],
[1., 1., 1.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[1., 1., 1.],
[1., 1., 1.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[1., 1., 1.],
[1., 1., 1.]]])
I'm inclined to believe I should be able to look at the shape of the matrix in question, the shape of the indices, and the index operation to intuitively know the output. However, I can't put the story together in my head. Like, what's the final shape of the subspace it is altering? How would you explain how this works?
The shapes:
input: (5, 5, 3)
ind1: (2, 2, 1)
ind2: (2, 1, 2)
final_op: input[ind1, ind2, :]
Upvotes: 1
Views: 37
Reputation: 231325
With shapes
ind1: (2, 2, 1)
ind2: (2, 1, 2)
they broadcast
together to select a (2,2,2) space
In [4]: ax1
Out[4]:
array([[[0],
[1]],
[[3],
[4]]])
In [5]: ax2
Out[5]:
array([[[0, 1]],
[[3, 4]]])
So for the 1st dimension (blocks) it is selecting blocks 0,1,3,and 4. In the second dimension it is also selecting these rows.
Together that's the first 2 rows of the first 2 blocks, and the last 2 rows of the last 2 blocks. That's where the 1s appear in your result.
A simpler way of creating the index arrays:
In [7]: np.array([[0,1],[3,4]])[:,:,None] # (2,2) expanded to (2,2,1)
In [8]: np.array([[0,1],[3,4]])[:,None,:] # expand to (2,1,2)
This is how broadcasting expands them:
In [10]: np.broadcast_arrays(ax1,ax2)
Out[10]:
[array([[[0, 0], # block indices
[1, 1]],
[[3, 3],
[4, 4]]]),
array([[[0, 1], # row indices
[0, 1]],
[[3, 4],
[3, 4]]])]
This may make the pattern clearer:
In [15]: a[ax1,ax2,:] = np.arange(1,5).reshape(2,2,1)
In [16]: a[:,:,0]
Out[16]:
array([[1., 2., 0., 0., 0.],
[3., 4., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 1., 2.],
[0., 0., 0., 3., 4.]])
Upvotes: 2