Chris Bamford

Reputation: 105

Achieving batch matrix multiply using tensordot

I'm trying to reproduce the batch (parallel) matrix multiplication behaviour of np.matmul using just tensordot, dot, reshaping, etc.

The library I am translating this code to does not have a matmul that supports batch multiplication, only dot and tensordot.

Additionally, I want to avoid iterating over the first dimension, and instead express this as a set of matrix multiplications and reshapes, so that as much of it as possible runs through BLAS/on the GPU (I have large numbers of small matrices to multiply in parallel).

Here is an example:

import numpy as np

angles = np.array([np.pi/4, 2*np.pi/4, 2*np.pi/4])

vectors = np.array([[1, 0], [1, -1], [-1, 0]])  # shape (3, 2)

s = np.sin(angles)
c = np.cos(angles)

rotations = np.array([[c,s],[-s,c]]).T  # shape (3, 2, 2): one 2x2 rotation matrix per angle

print(rotations)

print(vectors)

print("Correct: %s" % np.matmul(rotations, vectors.reshape(3,2,1)))

# I want to do this using tensordot/reshaping, i.e just gemm BLAS operations underneath
print("Wrong: %s" % np.tensordot(rotations, vectors, axes=(1,1)))

The output of this is:

Correct: [[[  7.07106781e-01]
  [  7.07106781e-01]]

 [[  1.00000000e+00]
  [  1.00000000e+00]]

 [[ -6.12323400e-17]
  [ -1.00000000e+00]]]


Wrong: [[[  7.07106781e-01   1.11022302e-16  -7.07106781e-01]
  [ -7.07106781e-01  -1.41421356e+00   7.07106781e-01]]

 [[  6.12323400e-17  -1.00000000e+00  -6.12323400e-17]
  [ -1.00000000e+00  -1.00000000e+00   1.00000000e+00]]

 [[  6.12323400e-17  -1.00000000e+00  -6.12323400e-17]
  [ -1.00000000e+00  -1.00000000e+00   1.00000000e+00]]]

Is there a way I can modify the second expression to get the same result as the first, using just dot/tensordot?

I believe it is possible, and have seen comments online suggesting as much, but never any examples.

Upvotes: 2

Views: 2513

Answers (1)

Divakar

Reputation: 221514

We need to keep one axis aligned between the two inputs and keep that same axis in the output. So, tensordot/dot won't work here: tensordot sums over the contracted axes and then forms the outer product of all remaining axes, so the shared batch axis of the two inputs cannot survive aligned into the output. But, we can use np.einsum, which in most cases (in my experience) is seen to be marginally faster than np.matmul.
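To see why tensordot alone falls short, here is a minimal sketch with the question's arrays: the full contraction applies every rotation to every vector, and the wanted batch result is only the diagonal over the two batch axes -

full = np.tensordot(rotations, vectors, axes=(2, 1))  # shape (3, 2, 3): rotation i applied to vector k
batch = full[np.arange(3), :, np.arange(3)]           # keep only the i == k pairs, shape (3, 2)

Extracting that diagonal does reproduce the result, but it computes N times more products than needed, so it is not a practical route.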

The implementation would look something like this -

np.einsum('ijk,ik->ij', rotations, vectors)

Also, the desired output has one trailing singleton dim, so append a new axis there with None/np.newaxis, like so -

np.einsum('ijk,ik->ij', rotations, vectors)[..., None]
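As a quick sanity check, this matches np.matmul on the question's data -

out = np.einsum('ijk,ik->ij', rotations, vectors)[..., None]
ref = np.matmul(rotations, vectors.reshape(3, 2, 1))
print(np.allclose(out, ref))  # True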

Upvotes: 4
