Reputation: 125
I am working on a project where I need to multiply two tensors that look like this. The first tensor holds 5 matrices and the second holds 5 column vectors. I need to multiply them so that each element of the resulting tensor is the column vector obtained by multiplying the corresponding matrix with the corresponding column vector.

tensor([[[8.1776, 0.6560],
         [0.6560, 2.3653]],
        [[8.1776, 0.6560],
         [0.6560, 2.3104]],
        [[8.9871, 0.6560],
         [0.6560, 2.2535]],
        [[1.3231, 0.6560],
         [0.6560, 2.3331]],
        [[4.8677, 0.6560],
         [0.6560, 2.2935]]], grad_fn=<AddBackward0>)
tensor([[-0.1836, -0.9153],
        [-0.1836, -0.8057],
        [-0.2288, -0.6442],
        [ 0.1017, -0.8555],
        [-0.0175, -0.7637]], grad_fn=<AddBackward0>)

A simple @ or * does not work. What should I do? I need to call backward later, so I cannot lose the gradients.
I tried @ and * and looked up docs such as torch.split, but none of them really worked.
Upvotes: 0
Views: 1758
Reputation: 26
You should familiarize yourself with the tensor product if you want to multiply tensors (https://pytorch.org/docs/stable/generated/torch.tensordot.html). I hope I understand correctly what it is you are trying to achieve here. Let's say your first tensor is called a with shape (5, 2, 2) and your second tensor is called b with shape (5, 2).
There is a very short solution to your problem using einsum:
result = torch.einsum('ijk, ik -> ij', a, b)
will give you the desired result.
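As a sanity check, here is a minimal sketch (using random stand-in tensors for a and b) showing that the einsum line matches an explicit per-index loop and keeps gradients intact:

```python
import torch

# random stand-ins for the tensors in the question
a = torch.randn(5, 2, 2, requires_grad=True)  # 5 matrices
b = torch.randn(5, 2, requires_grad=True)     # 5 vectors

# batched matrix-vector product: result[i] == a[i] @ b[i]
result = torch.einsum('ijk, ik -> ij', a, b)

# verify against an explicit loop
expected = torch.stack([a[i] @ b[i] for i in range(5)])
assert torch.allclose(result, expected)

# gradients flow through einsum, so backward() still works
result.sum().backward()
```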
However, to make clear what is happening, I'll provide a lengthier version. To achieve a matrix-vector multiplication of a and b, you first need to contract the last axis of a with the second axis of b:
c = torch.tensordot(a, b, dims=([2], [1]))
c has shape (5, 2, 5). This is because you took all possible combinations of multiplying each of the (2, 2) matrices in a with each of the vectors in b. This means that:
c[0, :, 0] gives the matrix multiplication of a[0] with b[0],
c[1, :, 0] gives the matrix multiplication of a[1] with b[0],
c[1, :, 1] gives the matrix multiplication of a[1] with b[1],
and so on.
However, if you are not interested in the "mixed" terms but only in the diagonal entries, then:
result = torch.diagonal(c, offset=0, dim1=0, dim2=2)
gives what you want. Just note that the diagonal entries are moved to the last dimension, so you still need to transpose the result to get shape (5, 2).
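Putting the longer route together, a sketch with random placeholder tensors:

```python
import torch

a = torch.randn(5, 2, 2)
b = torch.randn(5, 2)

# contract the last axis of a with the second axis of b -> shape (5, 2, 5)
c = torch.tensordot(a, b, dims=([2], [1]))

# keep only the matching matrix/vector combinations; diagonal() moves them
# to the last dimension, so transpose to recover shape (5, 2)
result = torch.diagonal(c, offset=0, dim1=0, dim2=2).T

# same values as the einsum one-liner
assert torch.allclose(result, torch.einsum('ijk, ik -> ij', a, b))
```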
I hope this helps!
Upvotes: 1
Reputation: 2493
It seems like what you are trying to do is best done with torch.einsum, a function that lets you perform custom product-summation operations. Say the first tensor is named t1 and the second is named t2. Then, to obtain all matrix-vector products, resulting in a 5x5x2 shaped tensor, you can use the following command:
torch.einsum('bij,ci->bcj', t1, t2)
The first string argument defines the product-summation operation. I suggest you read more about it here (it is the equivalent of NumPy's einsum operation, and the format is similar):
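A minimal sketch with random placeholder tensors. If instead you only want each matrix multiplied by its own matching vector (as in the question), reusing the batch index on both operands gives a (5, 2) result; both variants are shown:

```python
import torch

t1 = torch.randn(5, 2, 2)  # 5 matrices
t2 = torch.randn(5, 2)     # 5 vectors

# all cross combinations: out[b, c] pairs matrix b with vector c
out = torch.einsum('bij,ci->bcj', t1, t2)
assert out.shape == (5, 5, 2)

# matching pairs only: paired[b] == t1[b] @ t2[b], shape (5, 2)
paired = torch.einsum('bij,bj->bi', t1, t2)
assert torch.allclose(paired, torch.stack([t1[i] @ t2[i] for i in range(5)]))
```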
Upvotes: 1