Reputation: 4624
I am looking for a matrix operation in numpy that would speed up the following calculation.
I have two 3D arrays A and B. The first dimension indexes the example, and both arrays hold n_examples examples. What I want is to take the dot product of each corresponding pair of examples in A and B and sum the results:
import numpy as np

n_examples = 10
A = np.random.randn(n_examples, 20, 30)
B = np.random.randn(n_examples, 30, 5)

sum = np.zeros([20, 5])
for i in range(len(A)):
    sum += np.dot(A[i], B[i])
Upvotes: 3
Views: 3461
Reputation: 58915
This is a typical application for np.tensordot():
sum = np.tensordot(A, B, [[0,2],[0,1]])
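If the axis pairing looks opaque, a quick sanity check confirms it matches the loop in the question (using `np.random.default_rng` here only for reproducibility; the shapes are the ones from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples = 10
A = rng.standard_normal((n_examples, 20, 30))
B = rng.standard_normal((n_examples, 30, 5))

# Reference: explicit loop over examples
loop_sum = np.zeros((20, 5))
for i in range(n_examples):
    loop_sum += np.dot(A[i], B[i])

# tensordot contracts A's axes 0 and 2 against B's axes 0 and 1:
# axis 0 is the example index n, axis 2 of A / axis 1 of B is k
td_sum = np.tensordot(A, B, axes=[[0, 2], [0, 1]])

print(np.allclose(loop_sum, td_sum))  # True
```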
Timing
Using the following code:
import numpy as np

n_examples = 100
A = np.random.randn(n_examples, 20, 30)
B = np.random.randn(n_examples, 30, 5)

def sol1():
    sum = np.zeros([20, 5])
    for i in range(len(A)):
        sum += np.dot(A[i], B[i])
    return sum

def sol2():
    # list() is needed on Python 3, where map() returns an iterator
    return np.array(list(map(np.dot, A, B))).sum(0)

def sol3():
    return np.einsum('nmk,nkj->mj', A, B)

def sol4():
    return np.tensordot(A, B, [[2, 0], [1, 0]])

def sol5():
    return np.tensordot(A, B, [[0, 2], [0, 1]])
Results:
timeit sol1()
1000 loops, best of 3: 1.46 ms per loop
timeit sol2()
100 loops, best of 3: 4.22 ms per loop
timeit sol3()
1000 loops, best of 3: 1.87 ms per loop
timeit sol4()
10000 loops, best of 3: 205 µs per loop
timeit sol5()
10000 loops, best of 3: 172 µs per loop
On my computer tensordot() was the fastest solution, and changing the order in which the axes are evaluated changed neither the results nor the performance.
Upvotes: 4
Reputation: 54340
Ha, it can be done in just one line: np.einsum('nmk,nkj->mj', A, B).
See Einstein summation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html
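To see why the subscripts do the job: k is contracted within each example, and n is summed out because it does not appear on the right-hand side of `->`. A small check against a per-example reference (shapes taken from the question):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 20, 30))
B = rng.standard_normal((10, 30, 5))

# 'nmk,nkj->mj': k is contracted per example; n is summed out
# because it is absent from the output subscripts.
es = np.einsum('nmk,nkj->mj', A, B)

# Reference: dot each pair of examples, then sum
ref = sum(np.dot(a, b) for a, b in zip(A, B))

print(np.allclose(es, ref))  # True
```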
Not the same problem, but the idea is much the same; see the discussion and alternative methods in this related topic: numpy multiply matrices preserve third axis
Don't name your variable sum, as that shadows the built-in sum.
As @Jaime pointed out, the loop is actually faster for arrays of this size. In fact a solution based on map and sum is, albeit simpler, even slower:
In [19]:
%%timeit
SUM = np.zeros([20,5])
for i in range(len(A)):
    SUM += np.dot(A[i], B[i])
10000 loops, best of 3: 115 µs per loop
In [20]:
%timeit np.array(list(map(np.dot, A, B))).sum(0)
1000 loops, best of 3: 445 µs per loop
In [21]:
%timeit np.einsum('nmk,nkj->mj',A,B)
1000 loops, best of 3: 259 µs per loop
Things are different with larger dimensions:

n_examples = 1000
A = np.random.randn(n_examples, 20, 1000)
B = np.random.randn(n_examples, 1000, 5)
And:
In [46]:
%%timeit
SUM = np.zeros([20,5])
for i in range(len(A)):
    SUM += np.dot(A[i], B[i])
1 loops, best of 3: 191 ms per loop
In [47]:
%timeit np.array(list(map(np.dot, A, B))).sum(0)
1 loops, best of 3: 164 ms per loop
In [48]:
%timeit np.einsum('nmk,nkj->mj',A,B)
1 loops, best of 3: 451 ms per loop
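On NumPy >= 1.10 there is also the stacked-matmul route, which was not part of the benchmarks above: `A @ B` (or np.matmul) performs a batched matrix product over the leading axis, after which a single sum over axis 0 reproduces the loop. A sketch, not a claim about its timing:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 20, 100))
B = rng.standard_normal((1000, 100, 5))

# A @ B broadcasts the matrix product over the leading (example)
# axis, giving shape (1000, 20, 5); summing axis 0 matches the loop.
mm = (A @ B).sum(axis=0)

# Reference loop
ref = np.zeros((20, 5))
for i in range(len(A)):
    ref += np.dot(A[i], B[i])

print(np.allclose(mm, ref))  # True
```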
Upvotes: 2