Andrew Latham

Reputation: 6132

NumPy tensordot MemoryError

I have two matrices -- A is 3033x3033, and X is 3033x20. I am running the following lines (as suggested in the answer to another question I asked):

n, d = X.shape
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)   # pairwise differences, shape (n, n, d)
return np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0, 1), (0, 1)])   # result shape (d, d)

On the final line, Python simply stops and says "MemoryError". How can I get around this, either by changing some setting in Python or performing this operation in a more memory-efficient way?

Upvotes: 3

Views: 1048

Answers (3)

Will Martin

Reputation: 1023

Here is a function that does the calculation without any for loops and without any large temporary array. See the related question for a longer answer, complete with a test script.

def fbest(A, X):
    """Compute sum_ij A[i, j] * outer(X[i] - X[j], X[i] - X[j]) without the (n, n, d) temporary."""
    KA_best = np.tensordot(A.sum(1)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best += np.tensordot(A.sum(0)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(np.dot(A, X), X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(X, np.dot(A, X), axes=[(0,), (0,)])
    return KA_best
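As a quick sanity check, you can compare fbest against the direct tensordot formula from the question on a small random problem (a rough sketch; the sizes and seed here are only for illustration):

import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 20
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, d))

c = X.reshape(n, -1, d) - X.reshape(-1, n, d)                           # (n, n, d) pairwise differences
direct = np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0, 1), (0, 1)])
print(np.allclose(fbest(A, X), direct))                                 # expected: True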

I profiled the code with arrays of your size.

I love sp.einsum, by the way. It is a great place to start when speeding up array operations by removing for loops; you can do a great deal with a single call to sp.einsum.
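For instance (just an illustrative sketch, equivalent to X.T @ A @ X), a double sum of weighted outer products collapses into one call:

# sum_ij A[i, j] * outer(X[i], X[j]) in a single einsum, result shape (d, d)
np.einsum('ij,ik,jl->kl', A, X, X)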

The advantage of np.tensordot is that it links to whatever fast numerical library you have installed (e.g. MKL), so tensordot runs faster, and in parallel, when the right libraries are available.
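If you want to see which libraries your NumPy build links against, np.show_config() prints them:

import numpy as np
np.show_config()   # lists the BLAS/LAPACK backends (e.g. MKL or OpenBLAS)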

Upvotes: 3

Divakar

Reputation: 221554

You can employ two nested loops iterating along the last dimension of X. That last dimension has length 20, so this should still be efficient enough and, more importantly, leave a minimal memory footprint. Here's the implementation -

n, d = X.shape
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)   # pairwise differences, shape (n, n, d)
out = np.empty((d, d))  # d is a small number: 20
for i in range(d):
    for j in range(d):
        out[i, j] = (A * c[:, :, i] * c[:, :, j]).sum()
return out

You can replace the last line with np.einsum -

out[i, j] = np.einsum('ij->', A * c[:, :, i] * c[:, :, j])
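Since A * c[:, :, i] * c[:, :, j] and A * c[:, :, j] * c[:, :, i] are the same elementwise product, out is symmetric, so you could roughly halve the work (a minor tweak on the loop above):

for i in range(d):
    for j in range(i, d):
        val = (A * c[:, :, i] * c[:, :, j]).sum()
        out[i, j] = val
        out[j, i] = val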

Upvotes: 0

DavidW

Reputation: 30889

If you replace the final line with

return np.einsum('ij,ijk,ijl->kl',A,c,c)

you avoid creating the A.reshape(n, n, -1) * c intermediate (3033 by 3033 by 20, roughly 1.5 GB of float64, on top of the similarly sized c), which I think is your main problem.

My impression is that the version I give is probably slower (in cases where it doesn't run out of memory), but I haven't rigorously timed it.

It's possible you could go further and avoid creating c, but I can't immediately see how to do it. It would be a case of writing the whole thing out in terms of sums over matrix indices and seeing what it simplifies to.
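For reference, one way to eliminate c is to expand the quadratic form, which is essentially what fbest in the first answer does; a sketch (the variable names here are just for illustration) using only (n, d)-sized temporaries:

r = A.sum(1)                       # row sums of A
s = A.sum(0)                       # column sums of A
AX = A.dot(X)                      # shape (n, d)
out = (r[:, None] * X).T.dot(X)    # sum_ij A[i, j] * outer(X[i], X[i])
out += (s[:, None] * X).T.dot(X)   # sum_ij A[i, j] * outer(X[j], X[j])
out -= X.T.dot(AX)                 # sum_ij A[i, j] * outer(X[i], X[j])
out -= AX.T.dot(X)                 # sum_ij A[i, j] * outer(X[j], X[i])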

Upvotes: 2
