KingLui81
KingLui81

Reputation: 53

Indexing Multi-dimensional arrays

I know that multidimensional numpy arrays may be indexed with other arrays, but I did not figure out how the following works:

I would like to have the the items from raster, a 3d numpy array, based on indx, a 3d index array:

raster=np.random.rand(5,10,50)
indx=np.random.randint(0, high=50, size=(5,10,3))

What I want is another array with dimensions of indx that holds the values of raster based on the index of indx.

Upvotes: 2

Views: 1057

Answers (3)

swenzel
swenzel

Reputation: 7173

What we need in order to properly resolve your indices during broadcasting are two arrays a and b so that raster[a[i,j,k],b[i,j,k],indx[i,j,k]] will be raster[i,j,indx[i,j,k]] for i,j,k in corresponding ranges for indx's axes. The easiest solution would be:

x,y,z = indx.shape
a,b,_ = np.ogrid[:x,:y,:z]
raster[a,b,indx]

Where np.ogrid[...] creates three arrays with shapes (x,1,1), (1,y,1) and (1,1,z). We don't need the last one so we throw it away. Now when the other two are broadcast with indx they behave exactly the way we need.

Upvotes: 3

Divakar
Divakar

Reputation: 221524

If I understood the question correctly, for each row of indx, you are trying to index into the corresponding row in raster, but the column numbers vary depending on the actual values in indx. So, with that assumption, you can use a vectorized approach that uses linear indexing, like so -

M,N,R = raster.shape
linear_indx = R*np.arange(M*N)[:,None] + indx.reshape(M*N,-1)
out = raster.ravel()[linear_indx].reshape(indx.shape)

Upvotes: 1

Raniz
Raniz

Reputation: 11113

I'm assuming that you want to get 3 random values from each of the 3rd dimension arrays.

You can do this via list-comprehension thanks to advanced indexing

Here's an example using less number of values and integers so the output is easier to read:

import numpy as np

raster=np.random.randint(0, high=1000, size=(2,3,10))
indices=np.random.randint(0, high=10, size=(2,3,3))
results = np.array([ np.array([ column[col_indices] for (column, col_indices) in zip(row, row_indices) ]) for (row, row_indices) in zip(raster, indices) ])

print("Raster:")
print(raster)
print("Indices:")
print(indices)
print("Results:")
print(results)

Output:

Raster:
[[[864 353  11  69 973 475 962 181 246 385]
  [ 54 735 871 218 143 651 159 259 785 383]
  [532 476 113 888 554 587 786 172 798 232]]

 [[891 263  24 310 652 955 305 470 665 893]
  [260 649 466 712 229 474   1 382 269 502]
  [323 513  16 236 594 347 129  94 256 478]]]
Indices:
[[[0 1 2]
  [7 5 1]
  [7 8 9]]

 [[4 0 2]
  [6 1 4]
  [3 9 2]]]
Results:
[[[864 353  11]
  [259 651 735]
  [172 798 232]]

 [[652 891  24]
  [  1 649 229]
  [236 478  16]]]

It iterates simultaneously over the corresponding 3rd dimension arrays in raster and indices and uses advanced indexing to slice the desired indices from raster.

Here's a more verbose version that does the exact same thing:

results = []
for i in range(len(raster)):
    row = raster[i]
    row_indices = indices[i]
    row_results = []
    for j in range(len(row)):
        column = row[j]
        column_indices = row_indices[j]
        column_results = column[column_indices]
        row_results.append(column_results)
    results.append(np.array(row_results))
results = np.array(results)

Upvotes: 0

Related Questions