I have a matrix M of shape (N, L) and a 3D tensor P of shape (N, L, K). I want to get a matrix V of shape (N, K) where V[i] = M[i] @ P[i]. I can do it with a for loop, but that's inefficient. I want to do it with a single operation, or a few, so that it runs in parallel on CUDA.
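To make the target concrete, here is a minimal sketch of the loop version described above. The sizes and the random inputs are placeholders chosen for illustration, not values from my actual problem, and NumPy is used only so the snippet runs anywhere.

```python
import numpy as np

# Illustrative sizes and data (assumed, not from the real problem).
N, L, K = 4, 3, 5
rng = np.random.default_rng(0)
M = rng.standard_normal((N, L))     # shape (N, L)
P = rng.standard_normal((N, L, K))  # shape (N, L, K)

# Reference loop: one vector-matrix product per row i.
V_loop = np.empty((N, K))
for i in range(N):
    V_loop[i] = M[i] @ P[i]  # (L,) @ (L, K) -> (K,)
```

Each iteration is independent of the others, which is why a single batched operation should be possible.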
I tried simply multiplying them like so:
V = M @ P
but M is broadcast against every slice of P, so the result is a 3D tensor of shape (N, N, K) where V[i, j] = M[j] @ P[i].
np.diagonal(M @ P).T is basically what I want, but computing it that way wastes most of the work: it evaluates all N*N products M[j] @ P[i] and keeps only the N on the diagonal.
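For comparison, here is a sketch of two batched formulations that should avoid the (N, N, K) intermediate: an einsum contraction and a batched matmul with a singleton row axis. The sizes and random data are again illustrative assumptions. The code uses NumPy to match np.diagonal above; I believe torch.einsum and the @ operator accept the same expressions on CUDA tensors, but that part is not tested here.

```python
import numpy as np

# Illustrative sizes and data (assumed, not from the real problem).
N, L, K = 4, 3, 5
rng = np.random.default_rng(0)
M = rng.standard_normal((N, L))
P = rng.standard_normal((N, L, K))

# Wasteful baseline: builds an (N, N, K) tensor, then keeps the diagonal.
V_diag = np.diagonal(M @ P).T  # diagonal over axes 0 and 1 -> (K, N), .T -> (N, K)

# Option 1: einsum, contracting over l for each n.
V_einsum = np.einsum("nl,nlk->nk", M, P)

# Option 2: treat each row of M as a (1, L) matrix and use batched matmul.
# (N, 1, L) @ (N, L, K) -> (N, 1, K), then drop the singleton axis.
V_bmm = (M[:, None, :] @ P)[:, 0, :]
```

Both options give an (N, K) result that matches the diagonal approach while only computing the N products that are needed.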