Reputation: 65
I want to create an array whose elements are a function of their position. Something like
N = 1000000
newarray = np.zeros([N,N,N])
for i in range(N):
    for j in range(N):
        for k in range(N):
            newarray[i,j,k] = f(i,j,k)
Is there a way to increase the speed of this operation, by removing the for loops / parallelizing it using the numpy syntax?
This is the f function
def f(i,j,k):
    indices = (R[:,0]==i) * (R[:,1]==j) * (R[:,2]==k)
    return M[indices]
where for example
R = np.random.randint(0,N,[N,3])
M = np.random.randn(N)*15
and in the actual application they are not random.
Upvotes: 1
Views: 88
Reputation: 59701
You can do that operation with the at method of np.add:
import numpy as np
np.random.seed(0)
N = 100
R = np.random.randint(0, N, [N, 3])
M = np.random.randn(N) * 15
newarray = np.zeros([N, N, N])
np.add.at(newarray, (R[:, 0], R[:, 1], R[:, 2]), M)
In this case, if R has any repeated row, the corresponding value in newarray will be the sum of all the corresponding values in M.
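To see why np.add.at is needed here rather than a plain fancy-indexed +=, here is a minimal sketch on toy data. The small R and M below are made up for illustration and are not from the question; rows 0 and 2 deliberately point at the same cell.

```python
import numpy as np

# Hypothetical toy data: rows 0 and 2 of R both address cell (0, 1, 1).
R = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 1, 1]])
M = np.array([10.0, 20.0, 5.0])

idx = (R[:, 0], R[:, 1], R[:, 2])

# Plain fancy-indexed += is buffered: a repeated index does not accumulate.
buffered = np.zeros((2, 2, 2))
buffered[idx] += M

# np.add.at is unbuffered: every entry of M is added to its target cell.
accumulated = np.zeros((2, 2, 2))
np.add.at(accumulated, idx, M)

print("buffered   :", buffered[0, 1, 1])
print("accumulated:", accumulated[0, 1, 1])
```

With the unbuffered version, the repeated cell holds 10 + 5, while the buffered version loses one of the two contributions.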
EDIT: To take the average instead of sum for repeated elements you could do something like this:
import numpy as np
np.random.seed(0)
N = 100
R = np.random.randint(0, N, [N, 3])
M = np.random.randn(N) * 15
newarray = np.zeros([N, N, N])
np.add.at(newarray, (R[:, 0], R[:, 1], R[:, 2]), M)
newarray_count = np.zeros([N, N, N])
np.add.at(newarray_count, (R[:, 0], R[:, 1], R[:, 2]), 1)
m = newarray_count > 1
newarray[m] /= newarray_count[m]
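As a sanity check, the averaging approach can be compared against the explicit triple loop from the question on a small N. This is only a sketch: the sizes, the forced duplicate row, and the names sums, counts, mean_vec and mean_loop are assumptions for illustration, and the loop treats cells with no matching row as 0.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
R = rng.integers(0, N, size=(4 * N, 3))
R[1] = R[0]                      # force at least one repeated row
M = rng.standard_normal(4 * N) * 15

idx = (R[:, 0], R[:, 1], R[:, 2])

# Vectorized: accumulate sums and hit counts, then divide where count > 0.
sums = np.zeros((N, N, N))
counts = np.zeros((N, N, N))
np.add.at(sums, idx, M)
np.add.at(counts, idx, 1)
mean_vec = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

# Brute force: the loop from the question, averaging all matching values.
mean_loop = np.zeros((N, N, N))
for i in range(N):
    for j in range(N):
        for k in range(N):
            mask = (R[:, 0] == i) & (R[:, 1] == j) & (R[:, 2] == k)
            if mask.any():
                mean_loop[i, j, k] = M[mask].mean()

print(np.allclose(mean_vec, mean_loop))
```

The loop costs on the order of N^3 times the number of rows, while the vectorized version touches each row of R once, so the check is only practical for small N.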
Upvotes: 2