FooBar
FooBar

Reputation: 16478

Using an ND array to select on a dimension

How do I use an ndarray matrix to select elements of an ndarray?

Here's an example of what I mean.

a = np.arange(9)
b = np.arange(5)
c = np.arange(12)
A, B, C = np.meshgrid(a, b, c, indexing='ij')

Now, for each value of a, c, I want the b that minimizes A+C=B. Get the indices:

idx = np.abs(A+C-B).argmin(axis=1)

Clearly, idx has shape (9, 12) - it contains the index of b for each of the 9 a, and each of the 12 c.

Now, I would like to select the matrices with the "optimized b". That is, something along the lines of

B[:, idx, :]

that supposedly has shape (9, 1, 12) - because for each of the other combinations, it has only one value of b - the minimizing one. Now, B[:, idx, :] instead gives me the mesh of all potential combinations with shape (9, 9, 12, 12). I also tried

B[np.arange(B.shape[0]), idx, np.arange(B.shape[2])]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (9,) (9,12) (12,)

How do I get that specific type of matrix I described above?

Upvotes: 2

Views: 58

Answers (1)

Divakar
Divakar

Reputation: 221504

You just need to add an axis there with np.newaxis/None to trigger advanced-indexing -

B[np.arange(B.shape[0])[:,None], idx, np.arange(B.shape[2])]

The idea basically is to map the rows of idx with the first indexing array of np.arange(B.shape[0]) and as such we need to add an axis there. For mapping the columns of idx, we already have np.arange(B.shape[2]) aligned along the columns of it.

Alternative to np.newaxis

Another way to add that new axis would be with reshaping Thus, we could replace B[np.arange(B.shape[0])[:,None] with np.arange(B.shape[0]).reshape(-1,1).


Further optimization

We could optimize the codes, by using open arrays to replace the huge arrays created by meshgrid, like so -

A0, B0, C0 = np.ix_(a,b,c)
idx = np.abs(A0+C0-B0).argmin(axis=1)

Thus, get the final output, like so -

B[np.arange(len(a))[:,None], idx, np.arange(len(c))]

Just to give ourselves the idea of memory saving here -

In [47]: A.nbytes + B.nbytes + C.nbytes
Out[47]: 12960

whereas A0, B0, C0 are views into the input arrays a, b, c respectively and as such don't occupy any additional memory, i.e. absolutely free -

In [49]: np.shares_memory(a,A0)
Out[49]: True

For completeness sake, a direct way to get idx would be -

np.abs(a[:,None,None]+c-b[:,None]).argmin(axis=1)

Upvotes: 1

Related Questions