Reputation: 53
I know that multidimensional numpy arrays may be indexed with other arrays, but I did not figure out how the following works:
I would like to have the the items from raster
, a 3d numpy array, based on indx
, a 3d index array:
raster=np.random.rand(5,10,50)
indx=np.random.randint(0, high=50, size=(5,10,3))
What I want is another array with dimensions of indx
that holds the values of raster
based on the index of indx
.
Upvotes: 2
Views: 1057
Reputation: 7173
What we need in order to properly resolve your indices during broadcasting are two arrays a
and b
so that raster[a[i,j,k],b[i,j,k],indx[i,j,k]]
will be raster[i,j,indx[i,j,k]]
for i
,j
,k
in corresponding ranges for indx
's axes.
The easiest solution would be:
x,y,z = indx.shape
a,b,_ = np.ogrid[:x,:y,:z]
raster[a,b,indx]
Where np.ogrid[...]
creates three arrays with shapes (x,1,1)
, (1,y,1)
and (1,1,z)
. We don't need the last one so we throw it away. Now when the other two are broadcast with indx
they behave exactly the way we need.
Upvotes: 3
Reputation: 221524
If I understood the question correctly, for each row of indx
, you are trying to index into the corresponding row in raster
, but the column numbers vary depending on the actual values in indx
. So, with that assumption, you can use a vectorized approach that uses linear indexing, like so -
M,N,R = raster.shape
linear_indx = R*np.arange(M*N)[:,None] + indx.reshape(M*N,-1)
out = raster.ravel()[linear_indx].reshape(indx.shape)
Upvotes: 1
Reputation: 11113
I'm assuming that you want to get 3 random values from each of the 3rd dimension arrays.
You can do this via list-comprehension thanks to advanced indexing
Here's an example using less number of values and integers so the output is easier to read:
import numpy as np
raster=np.random.randint(0, high=1000, size=(2,3,10))
indices=np.random.randint(0, high=10, size=(2,3,3))
results = np.array([ np.array([ column[col_indices] for (column, col_indices) in zip(row, row_indices) ]) for (row, row_indices) in zip(raster, indices) ])
print("Raster:")
print(raster)
print("Indices:")
print(indices)
print("Results:")
print(results)
Output:
Raster:
[[[864 353 11 69 973 475 962 181 246 385]
[ 54 735 871 218 143 651 159 259 785 383]
[532 476 113 888 554 587 786 172 798 232]]
[[891 263 24 310 652 955 305 470 665 893]
[260 649 466 712 229 474 1 382 269 502]
[323 513 16 236 594 347 129 94 256 478]]]
Indices:
[[[0 1 2]
[7 5 1]
[7 8 9]]
[[4 0 2]
[6 1 4]
[3 9 2]]]
Results:
[[[864 353 11]
[259 651 735]
[172 798 232]]
[[652 891 24]
[ 1 649 229]
[236 478 16]]]
It iterates simultaneously over the corresponding 3rd dimension arrays in raster and indices and uses advanced indexing to slice the desired indices from raster.
Here's a more verbose version that does the exact same thing:
results = []
for i in range(len(raster)):
row = raster[i]
row_indices = indices[i]
row_results = []
for j in range(len(row)):
column = row[j]
column_indices = row_indices[j]
column_results = column[column_indices]
row_results.append(column_results)
results.append(np.array(row_results))
results = np.array(results)
Upvotes: 0