Mateo de Mayo

Reputation: 939

Why does the performance of these constant functions differ?

In the following snippet, why is py_sqrt2 almost twice as fast as np_sqrt2?

from time import time
from numpy import sqrt as npsqrt
from math import sqrt as pysqrt

NP_SQRT2 = npsqrt(2.0)
PY_SQRT2 = pysqrt(2.0)

def np_sqrt2():
    return NP_SQRT2

def py_sqrt2():
    return PY_SQRT2

def main():
    samples = 10000000

    it = time()
    E = sum(np_sqrt2() for _ in range(samples)) / samples
    print("executed {} np_sqrt2's in {:.6f} seconds E={}".format(samples, time() - it, E))

    it = time()
    E = sum(py_sqrt2() for _ in range(samples)) / samples
    print("executed {} py_sqrt2's in {:.6f} seconds E={}".format(samples, time() - it, E))


if __name__ == "__main__":
    main()

$ python2.7 snippet.py 
executed 10000000 np_sqrt2's in 1.380090 seconds E=1.41421356238
executed 10000000 py_sqrt2's in 0.855742 seconds E=1.41421356238
$ python3.6 snippet.py 
executed 10000000 np_sqrt2's in 1.628093 seconds E=1.4142135623841212
executed 10000000 py_sqrt2's in 0.932918 seconds E=1.4142135623841212

Notice that they are constant functions that just load from precomputed globals with the same value, and that the constants only differ in how they were computed at program start.

Moreover, the disassembly of these functions shows that they do as expected and only access the global constants.

In [73]: dis(py_sqrt2)                                                                                                                                                                            
  2           0 LOAD_GLOBAL              0 (PY_SQRT2)
              2 RETURN_VALUE

In [74]: dis(np_sqrt2)                                                                                                                                                                            
  2           0 LOAD_GLOBAL              0 (NP_SQRT2)
              2 RETURN_VALUE
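
For anyone who wants to reproduce the disassembly outside IPython, here is a minimal sketch. It assumes Python 3.4 or later (for dis.get_instructions), and the opnames helper is a hypothetical name added only for illustration. Newer interpreters may print extra bookkeeping instructions, but the global load and the return should still be there.

```python
import dis
from math import sqrt as pysqrt

PY_SQRT2 = pysqrt(2.0)


def py_sqrt2():
    return PY_SQRT2


def opnames(func):
    """Return the bytecode operation names of func (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    dis.dis(py_sqrt2)         # human-readable listing, like the output above
    print(opnames(py_sqrt2))  # the same information as a plain list
```

The bytecode only says that a global is loaded and returned. It says nothing about the type of the object behind that global, so it cannot by itself rule out a difference between the two constants.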

Upvotes: 0

Views: 132

Answers (2)

Mateo de Mayo

Reputation: 939

After running perf record on two versions of the script, one using only PY_SQRT2 and the other only NP_SQRT2, it seems that the version using the numpy constant makes extra calls.

This made me realize the two constants have different types:

In [4]: type(PY_SQRT2)                                                          
Out[4]: float

In [5]: type(NP_SQRT2)                                                          
Out[5]: numpy.float64

So operating on numpy.float64 values with sum (and maybe loading them?) is slower than operating on native floats.
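
A rough way to check this is to time sum over each kind of constant, plus a third constant converted back with float(). This is only a sketch: time_sum and AS_FLOAT are hypothetical names introduced here, the sample count is arbitrary, and the timings will vary by machine and by Python and numpy version.

```python
import timeit
from math import sqrt as pysqrt

from numpy import sqrt as npsqrt

NP_SQRT2 = npsqrt(2.0)      # numpy.float64
PY_SQRT2 = pysqrt(2.0)      # built-in float
AS_FLOAT = float(NP_SQRT2)  # numpy scalar converted to a built-in float


def time_sum(value, samples=1000000):
    """Return the seconds taken to sum value `samples` times (hypothetical helper)."""
    return timeit.timeit(lambda: sum(value for _ in range(samples)), number=1)


if __name__ == "__main__":
    constants = [("NP_SQRT2", NP_SQRT2), ("PY_SQRT2", PY_SQRT2), ("AS_FLOAT", AS_FLOAT)]
    for name, value in constants:
        # Print the constant's name, its type and the measured time
        print("{:<9} {:<10} {:.3f}s".format(name, type(value).__name__, time_sum(value)))
```

If the type is what matters, AS_FLOAT should time about the same as PY_SQRT2, which would also make converting the constant once at program start a cheap workaround.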

This answer helped as well.

Upvotes: 0

Joran Beasley

Reputation: 113988

Because you are sending it to C every time for just one value.

Try the following instead:

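import time
import numpy

# One vectorized call over 10k values: a single trip into C
t0 = time.time()
numpy.sqrt([2] * 10000)
t1 = time.time()
print("Took %0.3fs to do 10k sqrt(2) in one vectorized call" % (t1 - t0))

# 10k separate calls: one trip into C per value
t0 = time.time()
for i in range(10000):
    numpy.sqrt(2)
t1 = time.time()
print("Took %0.3fs to do 10k separate numpy.sqrt(2) calls" % (t1 - t0))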

Upvotes: 2
