Reputation: 195
I am writing a program in Python and I want to vectorize it as much as possible. I have the following variables:
E with shape (L,T).
w with shape (N,), with arbitrary values.
index with shape (A,), whose values are integers between 0 and N-1. The values are unique.
labels with the same shape as index ((A,)), whose values are integers between 0 and L-1. The values are not necessarily unique.
t, an integer between 0 and T-1.
We want to add the values of w at indices index to the array E at rows labels and column t. I used the following code:
E[labels,t] += w[index]
But this approach does not give the desired results when labels contains repeated values. For example:
import numpy as np
E = np.zeros([10,1])
w = np.arange(0,100)
index = np.array([1,3,4,12,80])
labels = np.array([0,0,5,5,2])
t = 0
E[labels,t] += w[index]
Gives
array([[ 3.],
[ 0.],
[80.],
[ 0.],
[ 0.],
[12.],
[ 0.],
[ 0.],
[ 0.],
[ 0.]])
But the correct answer would be
array([[ 4.],
[ 0.],
[80.],
[ 0.],
[ 0.],
[16.],
[ 0.],
[ 0.],
[ 0.],
[ 0.]])
Is there a way to achieve this behavior without using a for loop?
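For context, the short sketch below illustrates why the in-place addition drops contributions. It assumes that E[labels, t] += w[index] behaves like a gather, an add, and a scatter, which is how NumPy's indexing documentation describes augmented assignment with repeated indices. The names gathered and updated are introduced here only for illustration.

```python
import numpy as np

E = np.zeros([10, 1])
w = np.arange(0, 100)
index = np.array([1, 3, 4, 12, 80])
labels = np.array([0, 0, 5, 5, 2])
t = 0

# The augmented assignment is spelled out as three separate steps.
gathered = E[labels, t]        # five zeros: rows 0 and 5 are each read twice
updated = gathered + w[index]  # each value is added to the OLD contents of its row
E[labels, t] = updated         # rows 0 and 5 are written twice; only one write survives

print(E[:, t])
```

Because every repeated row is read before any write happens, the two contributions to rows 0 and 5 are never summed. Which of the duplicate writes survives should not be relied on.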
I realized I can use this: np.add.at(E,[labels,t],w[index])
but it gives me this warning:
FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
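The warning seems to come from passing the indices as a list, [labels, t]. A minimal sketch of the same call with a tuple, as the warning text itself suggests, using the example data from above:

```python
import numpy as np

E = np.zeros([10, 1])
w = np.arange(0, 100)
index = np.array([1, 3, 4, 12, 80])
labels = np.array([0, 0, 5, 5, 2])
t = 0

# A tuple (labels, t) is read as "rows labels, column t".
# np.add.at performs unbuffered addition, so repeated rows accumulate.
np.add.at(E, (labels, t), w[index])

print(E[:, t])  # [ 4.  0. 80.  0.  0. 16.  0.  0.  0.  0.]
```

On newer NumPy releases the list form is likely to raise an error or give a different result rather than warn, as the message predicts, so the tuple form is the safer spelling.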
Upvotes: 1
Views: 112
Reputation: 541
Pulled from a similar question, you can use np.bincount() to achieve your goal:
import numpy as np
import time
E = np.zeros([10,1])
w = np.arange(0,100)
index = np.array([1,3,4,12,80])
labels = np.array([0,0,5,5,2])
t = 0
# --------- Using np.bincount()
start = time.perf_counter()
for _ in range(10000):
    E = np.zeros([10,1])
    values = w[index]
    # sum the values that share a label; minlength gives one entry per row of E
    result = np.bincount(labels, weights=values, minlength=E.shape[0])
    E[:, t] += result
print("Bin count time: {}".format(time.perf_counter() - start))
print(E)
# --------- Using for loop
# start is not reset here, so the time printed below also includes the bincount loop above
for _ in range(10000):
    E = np.zeros([10,1])
    for i, in_ in enumerate(index):
        E[labels[i], t] += w[in_]
print("For loop time: {}".format(time.perf_counter() - start))
print(E)
Gives:
Bin count time: 0.045003452
[[ 4.]
[ 0.]
[80.]
[ 0.]
[ 0.]
[16.]
[ 0.]
[ 0.]
[ 0.]
[ 0.]]
For loop time: 0.09853353699999998
[[ 4.]
[ 0.]
[80.]
[ 0.]
[ 0.]
[16.]
[ 0.]
[ 0.]
[ 0.]
[ 0.]]
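As a sanity check, the bincount approach can be wrapped in a small function and compared against np.add.at for a case with more than one column. This is only a sketch: add_to_column is a hypothetical helper name, and the sizes and random data are made up for the comparison.

```python
import numpy as np

def add_to_column(E, labels, values, t):
    """Hypothetical helper: add values into column t of E, summing repeated row labels."""
    # bincount sums the weights per label; minlength gives one entry per row of E
    E[:, t] += np.bincount(labels, weights=values, minlength=E.shape[0])
    return E

rng = np.random.default_rng(0)
L, T, N, A = 6, 3, 50, 20                     # illustrative sizes only
w = rng.normal(size=N)
index = rng.choice(N, size=A, replace=False)  # unique positions in w
labels = rng.integers(0, L, size=A)           # repeated rows are allowed
t = 1

E_bincount = add_to_column(np.zeros((L, T)), labels, w[index], t)

E_add_at = np.zeros((L, T))
np.add.at(E_add_at, (labels, t), w[index])

print(np.allclose(E_bincount, E_add_at))  # True
```

Note that np.bincount expects a one-dimensional array of non-negative integers, which matches the description of labels in the question.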
Upvotes: 1