Reputation: 13
I have 3 NumPy arrays: a, b and c. b and c are very large arrays of the same length. Each element of b is 0, 1 or 2, and a has length 3.
Now I wonder if there is a way to eliminate the following for loop:
for i in range(len(b)):
    a[b[i]] += c[i]
Any comment would be greatly appreciated.
Upvotes: 1
Views: 107
Reputation: 221614
You can use np.bincount for such ID-based summing, like so:
a += np.bincount(b,c,minlength=a.size)
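As a rough check that the bincount one-liner matches the original loop, here is a minimal sketch. The array sizes, the seed, and the names a_loop and a_binc are assumptions made for illustration; they are not from the question.

```python
import numpy as np

# Illustrative inputs (sizes and seed are arbitrary assumptions)
rng = np.random.default_rng(0)
a = rng.random(3)
b = rng.integers(0, 3, size=1000)  # each element is 0, 1 or 2
c = rng.random(1000)

# Reference: the explicit loop from the question
a_loop = a.copy()
for i in range(len(b)):
    a_loop[b[i]] += c[i]

# bincount sums the weights c per index value in b;
# minlength keeps the result at least as long as a even if
# some index value never occurs in b
a_binc = a.copy()
a_binc += np.bincount(b, weights=c, minlength=a_binc.size)

print(np.allclose(a_loop, a_binc))
```

Note that np.bincount needs b to hold non-negative integers, and if b contained a value of a.size or larger, the result would be longer than a and the in-place addition would fail.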
Runtime test -
In [136]: # Large arrays as inputs
     ...: a = np.random.rand(3)
     ...: c = np.random.rand(10000)
     ...: b = np.random.randint(0,3,10000)
     ...:
     ...: # Make copies for testing
     ...: a1 = a.copy()
     ...: a2 = a.copy()
     ...:
In [137]: def bincount_app(a, b, c): # bincount approach as func
     ...:     a += np.bincount(b,c,minlength=a.size)
     ...:
In [138]: %timeit np.add.at(a1, b, c) # @user2357112's soln
1000 loops, best of 3: 1.29 ms per loop
In [139]: %timeit bincount_app(a2, b, c)
10000 loops, best of 3: 36.6 µs per loop
Upvotes: 2
Reputation: 281476
NumPy ufuncs have an at method for cases like this:
numpy.add.at(a, b, c)
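This does what everyone expects a[b] += c to do for an array b of indices, until they find out that it doesn't work: with repeated indices, a[b] += c applies only one update per index instead of accumulating them all.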
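A small sketch of the difference, using made-up inputs (the names a_fancy and a_at and the values in b and c are assumptions for illustration only):

```python
import numpy as np

b = np.array([0, 0, 1, 2, 2, 2])  # repeated indices
c = np.ones(6)

# Fancy-indexed += is buffered: each repeated index is written once
a_fancy = np.zeros(3)
a_fancy[b] += c

# add.at is unbuffered: every occurrence of an index is accumulated
a_at = np.zeros(3)
np.add.at(a_at, b, c)

print(a_fancy)  # [1. 1. 1.]
print(a_at)     # [2. 1. 3.]
```

Here a_at holds the per-index counts (2, 1 and 3), which is what the original loop would produce, while a_fancy only reflects a single update per index.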
Upvotes: 3