thedumbkid

Reputation: 125

How do I multiply tensors like this?

I am working on a project where I need to multiply two tensors that look like the ones below. The first tensor holds 5 matrices of shape (2, 2) and the second holds 5 column vectors of shape (2,). I need to multiply these two so that each element of the resulting tensor is the column vector obtained by multiplying the corresponding matrix with the corresponding vector.

tensor([[[8.1776, 0.6560],
         [0.6560, 2.3653]],

        [[8.1776, 0.6560],
         [0.6560, 2.3104]],

        [[8.9871, 0.6560],
         [0.6560, 2.2535]],

        [[1.3231, 0.6560],
         [0.6560, 2.3331]],

        [[4.8677, 0.6560],
         [0.6560, 2.2935]]], grad_fn=<AddBackward0>)

tensor([[-0.1836, -0.9153],
        [-0.1836, -0.8057],
        [-0.2288, -0.6442],
        [ 0.1017, -0.8555],
        [-0.0175, -0.7637]], grad_fn=<AddBackward0>)

A simple @ or * does not work. What should I do? I need to call backward afterwards, so I cannot lose the gradients.

I tried the @ and * operators and looked through the docs (e.g. torch.split), but none of them did what I need.
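For reference, the operation being asked for can be written as an explicit loop. This is a minimal sketch with random stand-in values, since the real tensors come from an autograd graph:

```python
import torch

# Stand-ins for the real tensors: 5 matrices of shape (2, 2) and
# 5 vectors of shape (2,), both part of an autograd graph.
a = torch.randn(5, 2, 2, requires_grad=True)
b = torch.randn(5, 2, requires_grad=True)

# Desired result: out[i] == a[i] @ b[i], one vector per matrix/vector pair.
out = torch.stack([a[i] @ b[i] for i in range(5)])
print(out.shape)  # torch.Size([5, 2])
```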

Upvotes: 0

Views: 1758

Answers (2)

physicsprune

Reputation: 26

You should familiarize yourself with the tensor product if you want to multiply tensors (https://pytorch.org/docs/stable/generated/torch.tensordot.html). I hope I understand correctly what you are trying to achieve here. Let's say your first array is called a with shape (5, 2, 2) and your second array is called b with shape (5, 2).

There is a very short solution to your problem using einsum:

result = torch.einsum('ijk,ik->ij', a, b)

will give you the desired result.
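A quick sanity check of the einsum one-liner against an explicit loop, using random stand-in tensors (the real ones carry their grad_fn through einsum the same way):

```python
import torch

a = torch.randn(5, 2, 2, requires_grad=True)  # batch of matrices
b = torch.randn(5, 2, requires_grad=True)     # batch of vectors

# result[i, j] = sum_k a[i, j, k] * b[i, k]  ==  (a[i] @ b[i])[j]
result = torch.einsum('ijk,ik->ij', a, b)

# Matches the explicit per-pair matrix-vector products.
expected = torch.stack([a[i] @ b[i] for i in range(5)])
assert torch.allclose(result, expected)

# Gradients flow through einsum, so backward still works.
result.sum().backward()
print(a.grad.shape)  # torch.Size([5, 2, 2])
```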

However, to make clear what is happening, I'll provide a lengthier version. To achieve a matrix vector multiplication of a and b, you first need to sum over the last axis of a and over the second axis of b.

c = torch.tensordot(a, b, dims=([2], [1]))

c has shape (5, 2, 5). This is because you took all possible combinations of multiplying each of the (2, 2) matrices of a with each of the vectors of b. This means that:

  • c[0, :, 0] gives the matrix multiplication of a[0] with b[0]
  • c[1, :, 0] gives the matrix multiplication of a[1] with b[0]
  • c[1, :, 1] gives the matrix multiplication of a[1] with b[1]
  • ...

However, if you are not interested in the "mixed" terms, only in the diagonal entries, then:

result = torch.diagonal(c, offset=0, dim1=0, dim2=2)

Just note that the diagonal entries are moved to the last dimension, so you still need to transpose the result to get shape (5, 2).
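Putting the tensordot and diagonal steps together, a sketch with random inputs:

```python
import torch

a = torch.randn(5, 2, 2)
b = torch.randn(5, 2)

# Contract a's last axis with b's second axis: c has shape (5, 2, 5),
# with c[i, :, j] == a[i] @ b[j].
c = torch.tensordot(a, b, dims=([2], [1]))

# Keep only the matching i == j entries. diagonal() moves the diagonal
# to the last dimension, giving shape (2, 5), so transpose to (5, 2).
result = torch.diagonal(c, offset=0, dim1=0, dim2=2).T

expected = torch.stack([a[i] @ b[i] for i in range(5)])
assert torch.allclose(result, expected)
```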

I hope this helps!

Upvotes: 1

Gil Pinsky

Reputation: 2493

It seems like what you are trying to do is best done with torch.einsum, a function that lets you perform custom product-summation operations.

Say the first tensor is named t1 and the second t2. Then, to obtain a matrix-vector multiplication resulting in a 5x5x2 shaped tensor, you should use the following command:

torch.einsum('bij,ci->bcj', t1, t2)

The first string argument defines the product-summation operation. I suggest you read more about it here (it covers NumPy's einsum, but the format is the same):

Understanding NumPy's einsum
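For illustration, here is what that call produces on random stand-in tensors. Note that 'bij,ci->bcj' sums over the matrix's first index, i.e. it computes t1[b].T @ t2[c]; for the symmetric matrices in the question this equals t1[b] @ t2[c]:

```python
import torch

t1 = torch.randn(5, 2, 2)
t2 = torch.randn(5, 2)

# out[b, c, j] = sum_i t1[b, i, j] * t2[c, i]  ==  (t1[b].T @ t2[c])[j]
out = torch.einsum('bij,ci->bcj', t1, t2)
print(out.shape)  # torch.Size([5, 5, 2])

# The b == c "diagonal" holds the per-pair products.
assert torch.allclose(out[3, 3], t1[3].T @ t2[3])
```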

Upvotes: 1
