Reputation: 1245
I have ten 6000 x 784
matrices. If I run np.cov and then pinv on each of them, the total time is:
%timeit for n in range(10): pinv(np.cov(A,rowvar = False))
1.64 s ± 78.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
If I instead loop the np.cov and compute the pinv of the 3D stack, I get:
%timeit for n in range(10): dd[n] = np.cov(A,rowvar = False)
485 ms ± 18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit pinv(dd)
4.59 s ± 369 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Why is pinv on the 3D so slow?
What can I do to have to compute np.cov of a stack and a corresponding pinv?
Upvotes: 2
Views: 200
Reputation: 765
This is due to the implementation. The limiting step is the numpy matmul
. First I thought the SVD is the limiting step, but this scales linearily (~130 ms per iteration, 1.3 s for the pack of ten).
Here are some parts of the profiling:
1 call (your first snippet)
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.131 0.131 0.131 0.131 linalg.py:1299(svd)
1 0.011 0.011 0.011 0.011 {built-in method numpy.core.multiarray.matmul}
1 0.003 0.003 0.144 0.144 <ipython-input-76-2a63f1c84429>:1(pinv)
1 call of a pack of 10 (your second snippet)
ncalls tottime percall cumtime percall filename:lineno(function)
1 2.952 2.952 2.952 2.952 {built-in method numpy.core.multiarray.matmul}
1 1.265 1.265 1.265 1.265 linalg.py:1299(svd)
1 0.026 0.026 4.266 4.266 <ipython-input-76-2a63f1c84429>:1(pinv)
Maybe there is an internal issue of the C-ordering (but recent numpy versions should take care of it), or it is just inefficient in this case... Tried briefly to change to the BLAS interface directly using the scipy implementation, but this operates on two-dimensional matrices only. These issues (that matmul is slow) has been actually raised elsewhere.
Bottom line, do it non-vectorized or try to use other implementations. Sometimes the fastet way for matrix multiplication is to use the eigen
functions, but I never tried it in a multi-dimensional problem.
Upvotes: 1