Reputation: 363818
As part of a batch Euclidean distance computation, I'm computing
(X * X).sum(axis=1)
where X is a rather large 2-d array. This works fine, but it constructs a temporary array of the same size as X. Is there any way to get rid of this temporary while retaining the efficiency of a vectorized operation?
The obvious candidate,
np.array([np.dot(row, row) for row in X])
works, but uses a Python list as a temporary, making it rather slow.
Without the axis argument, the memory-efficient form would be
(X * X).sum() => np.dot(X.ravel(), X.ravel())
and I know that, when axis=1, it's equivalent to
np.diag(np.dot(X, X.T))
which got me looking into generalizations of dot such as np.inner, np.tensordot and np.einsum, but I can't figure out how they would solve my problem.
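For reference, here is a small script collecting the variants above and checking that they agree (the array shape is just an illustrative choice):
import numpy as np

X = np.random.rand(1000, 100)  # illustrative size only

a = (X * X).sum(axis=1)                        # vectorized, but builds an X-sized temporary
b = np.array([np.dot(row, row) for row in X])  # avoids the big temporary, but loops in Python
c = np.diag(np.dot(X, X.T))                    # also correct, but builds an even larger (n, n) temporary

assert np.allclose(a, b) and np.allclose(a, c)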
Upvotes: 10
Views: 1136
Reputation: 32521
The einsum equivalent is:
np.einsum('ij,ij->i', X, X)
I am not sure how this works internally, though, so it may or may not solve your problem.
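As a quick sanity check (the array shape here is arbitrary), the einsum form should match the original expression:
import numpy as np

X = np.random.rand(1000, 100)  # arbitrary example data

row_sq_norms = np.einsum('ij,ij->i', X, X)  # sums the elementwise products along each row
assert np.allclose(row_sq_norms, (X * X).sum(axis=1))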
Upvotes: 11