Reputation: 125
I am working on a project where I need to multiply two tensors that look like this. The first tensor holds 5 matrices and the second holds 5 column vectors. I need to multiply them so that each element of the resulting tensor is the column vector obtained by multiplying the corresponding matrix with the corresponding column vector.

tensor([[[8.1776, 0.6560],
         [0.6560, 2.3653]],
        [[8.1776, 0.6560],
         [0.6560, 2.3104]],
        [[8.9871, 0.6560],
         [0.6560, 2.2535]],
        [[1.3231, 0.6560],
         [0.6560, 2.3331]],
        [[4.8677, 0.6560],
         [0.6560, 2.2935]]], grad_fn=<AddBackward0>)
tensor([[-0.1836, -0.9153],
        [-0.1836, -0.8057],
        [-0.2288, -0.6442],
        [ 0.1017, -0.8555],
        [-0.0175, -0.7637]], grad_fn=<AddBackward0>)

A simple @ or * does not work. What should I do? I need to call backward later, so I cannot lose the gradients.
I tried @ and * and looked up docs such as torch.split, but none of them really worked.
Upvotes: 0
Views: 1758
Reputation: 26
You should familiarize yourself with the tensor product if you want to multiply tensors (https://pytorch.org/docs/stable/generated/torch.tensordot.html). I hope I understand correctly what it is you are trying to achieve here. Let's say your first tensor is called a with shape (5, 2, 2) and your second tensor is called b with shape (5, 2).
There is a very short solution to your problem using einsum:
result = torch.einsum('ijk, ik -> ij', a, b)
will give you the desired result.
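As a sanity check, here is a minimal sketch (using random stand-in tensors for a and b) showing that the einsum line matches an explicit per-index loop and keeps gradients intact:

```python
import torch

# random stand-ins for the tensors in the question
a = torch.randn(5, 2, 2, requires_grad=True)  # 5 matrices
b = torch.randn(5, 2, requires_grad=True)     # 5 vectors

# batched matrix-vector product: result[i] == a[i] @ b[i]
result = torch.einsum('ijk, ik -> ij', a, b)

# verify against an explicit loop
expected = torch.stack([a[i] @ b[i] for i in range(5)])
assert torch.allclose(result, expected)

# gradients flow through einsum, so backward() still works
result.sum().backward()
```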
However, to make clear what is happening, I'll provide a lengthier version. To achieve a matrix-vector multiplication of a and b, you first need to contract the last axis of a with the second axis of b:
c = torch.tensordot(a, b, dims=([2], [1]))
c has shape (5, 2, 5). This is because you took all possible combinations of multiplying each of the (2, 2) matrices in a with each of the vectors in b. This means that:
c[0, :, 0] gives the matrix multiplication of a[0] with b[0],
c[1, :, 0] gives the matrix multiplication of a[1] with b[0],
c[1, :, 1] gives the matrix multiplication of a[1] with b[1],
and so on.
However, if you are not interested in the "mixed" terms but only in the diagonal entries, then:
result = torch.diagonal(c, offset=0, dim1=0, dim2=2)
gives what you want. Just note that the diagonal entries are moved to the last dimension, so you still need to transpose the result to get shape (5, 2).
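Putting the longer route together, a sketch with random placeholder tensors:

```python
import torch

a = torch.randn(5, 2, 2)
b = torch.randn(5, 2)

# contract the last axis of a with the second axis of b -> shape (5, 2, 5)
c = torch.tensordot(a, b, dims=([2], [1]))

# keep only the matching matrix/vector combinations; diagonal() moves them
# to the last dimension, so transpose to recover shape (5, 2)
result = torch.diagonal(c, offset=0, dim1=0, dim2=2).T

# same values as the einsum one-liner
assert torch.allclose(result, torch.einsum('ijk, ik -> ij', a, b))
```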
I hope this helps!
Upvotes: 1
Reputation: 2493
It seems like what you are trying to do is best done with torch.einsum, a function that lets you perform custom product-summation operations. Say the first tensor is named t1 and the second is named t2. Then, to obtain all matrix-vector products, resulting in a 5x5x2 shaped tensor, you can use the following command:
torch.einsum('bij,ci->bcj', t1, t2)
The first string argument defines the product-summation operation. I suggest you read more about it here (it is the equivalent of NumPy's einsum operation, and the format is similar):
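A minimal sketch with random placeholder tensors. If instead you only want each matrix multiplied by its own matching vector (as in the question), reusing the batch index on both operands gives a (5, 2) result; both variants are shown:

```python
import torch

t1 = torch.randn(5, 2, 2)  # 5 matrices
t2 = torch.randn(5, 2)     # 5 vectors

# all cross combinations: out[b, c] pairs matrix b with vector c
out = torch.einsum('bij,ci->bcj', t1, t2)
assert out.shape == (5, 5, 2)

# matching pairs only: paired[b] == t1[b] @ t2[b], shape (5, 2)
paired = torch.einsum('bij,bj->bi', t1, t2)
assert torch.allclose(paired, torch.stack([t1[i] @ t2[i] for i in range(5)]))
```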
Upvotes: 1