Reputation: 115
Suppose I have a 2x3 matrix A
:
1 2 3
4 5 6
and a vector y
of length 4:
0 1 2 1
as well as another 4x2 matrix B
:
0 0
1 1
2 2
3 3
I want to update the columns of A
multiple times by adding from rows of B
.
And the index of columns of A
to be updated is given by y
.
Using for
loops, this can be done as:
for i in np.arange(4):
A[:,y[i]] += B[i,:]
I had implemented this using ufunc.at
as:
np.add.at(A.T,y,B)
However, the performance of ufunc.at
is almost as bad as using for
loops.
How can I get a different vectorized implementation?
Updating using A[:,y]+=B.T
seems to update each column only once.
Upvotes: 1
Views: 286
Reputation: 221614
Approach #1
Here's one approach using summations at intervals with np.add.reduceat
-
def reduceat_app(A, y, B):
idx = y.argsort()
y0 = y[idx]
sep_idx = np.concatenate(([0], np.flatnonzero(y0[1:] != y0[:-1])+1))
A += np.add.reduceat(B[idx],sep_idx).T
Approach #2
With relatively lesser number of rows in A
, we can also use np.bincount
to perform those bin-based summations in an iterative manner for each row, like so -
def bincount_loopy_app(A, y, B):
n = A.shape[1]
for i,a in enumerate(A):
a += np.bincount(y,B[:,i],minlength=n).astype(A.dtype)
Approach #3
We can vectorize previous approach by creating a 2D
grid of y
indices/bins for all elements such that for each row, we would have offsetted bins. With that offsetting, bincount
could be used to perform bin based summations across all rows in a vectorized manner.
Thus, the implementation would be -
def bincount_vectorized_app(A, y, B):
m,n = A.shape
IDs = y[:,None] + n*np.arange(B.shape[1])
vals = np.bincount(IDs.ravel(), B.ravel(), minlength=m*n)
A += vals.astype(A.dtype).reshape((m,n))
Upvotes: 1