P. Camilleri

Reputation: 13218

Performance difference between scipy and numpy norm

I have always assumed scipy.linalg.norm() and numpy.linalg.norm() to be equivalent (the SciPy version did not use to accept an axis argument, but now it does). However, the following simple example shows a significant difference in performance. What is the reason for it?

In [1]: from scipy.linalg import norm as normsp
In [2]: from numpy.linalg import norm as normnp 
In [3]: import numpy as np
In [4]: a = np.random.random(size=(1000, 2000))

In [5]: %timeit normsp(a)
The slowest run took 5.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 2.85 ms per loop

In [6]: %timeit normnp(a)
The slowest run took 6.39 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 558 µs per loop

scipy version is 0.18.1, numpy is 1.11.1

Upvotes: 6

Views: 2298

Answers (1)

Vlas Sokolov

Reputation: 3893

Looking at the source code reveals that SciPy has its own norm function, which wraps either numpy.linalg.norm or a BLAS function that is slower but handles floating-point overflow better (see the discussion on this PR).
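
To illustrate what "handles overflow better" can mean, here is a minimal sketch comparing a plain square-root-of-sum-of-squares with a rescaled variant. Both naive_norm and scaled_norm are hypothetical helpers written for this illustration; they are not SciPy's or BLAS's actual code, only the general idea of dividing by the largest magnitude before squaring.

```python
import numpy as np


def naive_norm(x):
    """2-norm as sqrt(sum(x**2)); the squares can overflow for huge entries."""
    x = np.asarray(x, dtype=float)
    with np.errstate(over="ignore"):  # silence the overflow warning
        return np.sqrt(np.sum(x * x))


def scaled_norm(x):
    """2-norm computed after dividing by the largest magnitude (illustrative)."""
    x = np.asarray(x, dtype=float)
    scale = np.max(np.abs(x))
    if scale == 0.0:
        return 0.0
    # Every entry of x / scale has magnitude <= 1, so squaring cannot overflow.
    return scale * np.sqrt(np.sum((x / scale) ** 2))


x = np.array([1e200, 1e200])
print(naive_norm(x))   # inf, because 1e200 ** 2 exceeds the float64 range
print(scaled_norm(x))  # finite, about 1e200 * sqrt(2)
```

The extra pass to find the maximum and the division are the sort of additional work that would make an overflow-safe routine slower than the direct computation.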

However, in the example you give, SciPy does not appear to use a BLAS function, so I do not think that is responsible for the time difference you see. SciPy does perform some other checks before calling the NumPy version of norm, though. In particular, the finiteness check a = np.asarray_chkfinite(a) is a likely cause of the performance difference:

In [103]: %timeit normsp(a)
100 loops, best of 3: 5.1 ms per loop

In [104]: %timeit normnp(a)
1000 loops, best of 3: 744 µs per loop

In [105]: %timeit np.asarray_chkfinite(a)
100 loops, best of 3: 4.13 ms per loop

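So it looks like np.asarray_chkfinite roughly accounts for the difference in time taken to evaluate the norms: the gap between the two norm calls is about 4.4 ms (5.1 ms minus 744 µs), and the check alone takes 4.13 ms.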
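
The same comparison can be reproduced outside IPython with only NumPy. This is a rough sketch: checked_norm is a hypothetical stand-in for the code path described above (finiteness check, then numpy.linalg.norm), not SciPy's actual implementation, and best_ms is a small helper introduced here. Absolute timings will depend on the machine and library versions.

```python
import timeit

import numpy as np


def checked_norm(a):
    """Hypothetical stand-in: finiteness check followed by NumPy's norm."""
    a = np.asarray_chkfinite(a)  # raises ValueError if a contains NaN or inf
    return np.linalg.norm(a)


def best_ms(func, repeat=3, number=10):
    """Best per-call time in milliseconds over a few repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1e3


a = np.random.random(size=(1000, 2000))

# For finite input the check does not change the result.
assert np.isclose(checked_norm(a), np.linalg.norm(a))

for label, func in [
    ("np.linalg.norm", lambda: np.linalg.norm(a)),
    ("np.asarray_chkfinite", lambda: np.asarray_chkfinite(a)),
    ("check + norm", lambda: checked_norm(a)),
]:
    print("%-22s %.3f ms per call" % (label, best_ms(func)))

# What the extra time buys: non-finite input is rejected instead of
# silently propagating into the result.
bad = a.copy()
bad[0, 0] = np.nan
try:
    checked_norm(bad)
except ValueError as exc:
    print("rejected:", exc)
```

If the input is already known to be finite, calling numpy.linalg.norm directly avoids the cost of the check.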

Upvotes: 6
