Reputation: 381
I have a matrix A, shaped (N, D, 4). First I calculate A transposed, A_t. I want to calculate the product of A_t times A, with the resulting matrix shaped (D, D), and with the product treating the last 4-component vector as if it were a single number. (The dot product of two vectors is a number.)
import numpy as np

N = 15
D = 98
A = np.random.random((N, D, 4))

A_t = np.zeros((D, N, 4))
for i in range(N):
    A_t[:, i] = A[i]

S = np.zeros((D, D))
for i in range(D):
    row = A_t[i]
    for j in range(D):
        col = A[:, j, :]
        val = 0
        for n in range(N):
            val += np.matmul(row[n], col[n])
        S[i][j] = val

print(A.shape)
print(A_t.shape)
print(S.shape)
Upvotes: 1
Views: 3070
Reputation: 114230
Let's walk through the operations you are attempting and see what we can do to simplify them. For starters, you can write
A_t = np.swapaxes(A, 0, 1)
This is equivalent to
A_t = np.transpose(A, [1, 0, 2])
or
A_t = A.transpose([1, 0, 2])
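All three forms produce the same (D, N, 4) array; here is a quick sanity check, assuming the A from your snippet:
A_t = np.swapaxes(A, 0, 1)
assert A_t.shape == (D, N, 4)
assert np.array_equal(A_t, np.transpose(A, [1, 0, 2]))
assert np.array_equal(A_t, A.transpose([1, 0, 2]))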
As it happens, none of these is necessary for your current application. To see why, let's work with a simplified example:
np.random.seed(42)
N = 4
D = 3
K = 2
A = np.random.randint(0, 10, (N, D, K))
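For later comparison, here is your original triple loop run on this small example; S_ref is just a name introduced here for the reference result:
A_t = np.swapaxes(A, 0, 1)               # shape (D, N, K)
S_ref = np.zeros((D, D))
for i in range(D):
    for j in range(D):
        for n in range(N):
            S_ref[i, j] += A_t[i, n] @ A[n, j]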
In your outer loop, you have row = A_t[i]. But by definition of your transpose, this is identical to row = A[:, i, :], which makes your life much easier and the transpose superfluous.
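You can confirm the equivalence on the small example (using the A_t computed above):
for i in range(D):
    assert np.array_equal(A_t[i], A[:, i, :])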
The inner loop sums some dot products:
val = 0
for n in range(N):
    val += np.matmul(row[n], col[n])
If you remember the definition of a dot product, you will see that you are doing the equivalent of
np.sum(np.sum(row * col, axis=1), axis=0)
The inner sum is the sum-product in your loop, while the outer sum is the accumulation of val. Summing across both dimensions separately is the same as summing the entire buffer at once, so we can immediately replace the inner loop with just
S = np.zeros((D, D))
for i in range(D):
    for j in range(D):
        S[i][j] = np.sum(A[:, i, :] * A[:, j, :])
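On the small integer example, this reproduces the reference result exactly:
assert np.array_equal(S, S_ref)  # same values as the original triple loop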
You can simplify this with np.dot, np.tensordot, np.einsum, or just plain broadcasting. The first two are needlessly complicated because you are really sum-multiplying over two dimensions simultaneously. np.einsum offers the most direct solution overall, but it is a less straightforward translation of your code.
Solution 1: Broadcasting
Let's start with a straightforward broadcasting version of the double loop, before moving on to the more idiomatic solution:
S = (A[:, None, ...] * A[:, :, None, ...]).sum(axis=(0, -1))
or
S = np.sum(A[:, None, ...] * A[:, :, None, ...], axis=(0, -1))
This creates views of A shaped (N, 1, D, K) and (N, D, 1, K) respectively. The multiplication broadcasts the two D axes against each other, producing exactly the pairwise products that the for loops compute, so the final sum over the N and K axes does exactly what the line S[i][j] = np.sum(A[:, i, :] * A[:, j, :]) did before.
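A quick check of the shape and the result on the small example (S_bcast is a name introduced here for comparison):
S_bcast = np.sum(A[:, None, ...] * A[:, :, None, ...], axis=(0, -1))
print(S_bcast.shape)                      # (3, 3), i.e. (D, D)
assert np.array_equal(S_bcast, S_ref)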
Solution 2: np.einsum
This solution lets you apply sum-product directly to whatever axes you want:
S = np.einsum('ijk,ihk->jh', A, A)
Notice that you must use a different letter for the second axis of the second matrix (j
and h
), to indicate that you will not be summing over that axis. S
is symmetrical, but if it were not, you could transpose it by transposing to ->hj
in the result.
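Again on the small example, this agrees with the other versions, and swapping the output letters transposes the result:
S_einsum = np.einsum('ijk,ihk->jh', A, A)
assert np.array_equal(S_einsum, S_ref)
assert np.array_equal(np.einsum('ijk,ihk->hj', A, A), S_einsum.T)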
Upvotes: 1