varantir

Reputation: 6854

Why is numpy.dot much faster than numpy.einsum?

I have numpy compiled with OpenBLAS, and I am wondering why einsum is much slower than dot. I understand why in the three-index case, but I don't understand why it is also less performant in the two-index case. Here is an example:

import numpy as np
A = np.random.random([1000,1000])
B = np.random.random([1000,1000])

%timeit np.dot(A,B)

Out: 10 loops, best of 3: 26.3 ms per loop

%timeit np.einsum("ij,jk",A,B)

Out: 5 loops, best of 3: 477 ms per loop

Is there a way to let einsum use OpenBlas and parallelization like numpy.dot? Why does np.einsum not just call np.dot if it notices a dot product?

Upvotes: 6

Views: 2929

Answers (1)

hpaulj

Reputation: 231355

einsum parses the index string, constructs an nditer object, and uses it to perform a sum-of-products iteration. It has special cases where the indexes just perform axis swaps or sums ('ii->i'). It may also have special cases for 2 and 3 operands (as opposed to more). But it does not make any attempt to invoke external libraries.
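A rough sketch of what that sum-of-products evaluation amounts to for "ij,jk" (my own illustration in terms of broadcasting, not einsum's actual C loop), along with two of the special-case index strings mentioned above:

```python
import numpy as np

A = np.random.random([200, 200])
B = np.random.random([200, 200])

# Sum-of-products for "ij,jk->ik": broadcast, multiply elementwise,
# then reduce over the shared index j. No BLAS is involved in this
# style of evaluation, which is why it is slower than dot.
sum_of_products = (A[:, :, None] * B[None, :, :]).sum(axis=1)

assert np.allclose(sum_of_products, np.dot(A, B))

# Index strings that einsum can handle as special cases,
# without a general product loop:
diag = np.einsum("ii->i", A)   # diagonal, like np.diagonal(A)
swap = np.einsum("ij->ji", A)  # transpose, just an axis swap
```

Note that the broadcast version materializes a (200, 200, 200) intermediate array, so it is only meant to show the arithmetic, not to be efficient.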

I worked out a pure Python work-alike, but with more focus on the parsing than on the calculation special cases.

tensordot reshapes and swaps axes so that it can then call dot for the actual calculation.
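A minimal sketch of that reshape-and-dot strategy (my own reconstruction, with an arbitrary choice of contracted axes, not tensordot's source):

```python
import numpy as np

A = np.random.random([4, 5, 6])
B = np.random.random([6, 5, 7])

# Contract A's axes (1, 2) against B's axes (1, 0).
# Move the contracted axes together, flatten them, and let dot
# (i.e. BLAS) do the heavy lifting on the resulting 2-D matrices.
Am = A.reshape(4, 5 * 6)                    # free axis 0 leads; (1, 2) flattened
Bm = B.transpose(1, 0, 2).reshape(5 * 6, 7) # reorder to (1, 0), then flatten
manual = np.dot(Am, Bm)                     # shape (4, 7)

assert np.allclose(manual, np.tensordot(A, B, axes=([1, 2], [1, 0])))
```

The key point is that both flattened matrices list the contracted axes in the same order, so a single BLAS matrix multiply computes the whole contraction.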

Upvotes: 3
