undy
undy

Reputation: 93

comparing performance of python factorial (math vs. scipy)

Why is math.factorial so much faster than scipy.special.factorial?

import timeit

t = timeit.timeit("from math import factorial; factorial(20)"); print(t)
0.6399730000412092

t = timeit.timeit("from scipy.special import factorial; factorial(20)"); print(t)
5.339432950946502

t = timeit.timeit("from scipy.special import factorial; factorial(20, exact=True)"); print(t)
1.7984685270348564

I am on Python 3.7 (scipy version is 1.1.0)

Upvotes: 1

Views: 1145

Answers (4)

roganjosh
roganjosh

Reputation: 13175

This is a common mistake, similar to expecting things like np.exp() to work faster than from the math module. This is not the purpose of such functions. The scientific stack (NumPy, Pandas, SciPy and others) is concerned with vectorized approaches across arrays, not single values.

from math import factorial

factorial([20, 20, 20])

This will give TypeError: an integer is required (got type list)

But:

from scipy.special import factorial

factorial([20, 20, 20])

Will compute the factorial for the whole list, giving:

array([2.43290201e+18, 2.43290201e+18, 2.43290201e+18])

If you dropped your math.factorial calculation into a for loop to cover multiple items in the list, it will very quickly fall behind on timings vs. the vectorized approach (which would be even faster if you supplied a NumPy array rather than a list in the first place)

Upvotes: 5

Patrick Artner
Patrick Artner

Reputation: 51643

Beside the already mentioned facts about scipy targetting vectorized inputs - you use a very small test case here - if you widen it, it wont be as clear:

import timeit

for z in range(20,100000,10000):
    t1 = timeit.timeit(f"from math import factorial; factorial({z})",number=10)
    t2 = timeit.timeit(f"from scipy.special import factorial; factorial({z})",number=10)
    t3 = timeit.timeit(f"from scipy.special import factorial; factorial({z}, exact=True)",number=10)
    print(f"{z}  : {t1:<20}  {t2:<20}  {t3:<20}")

Output:

# facNr    factorial          scipy.special.factorial(*)   exact = True
   20  : 0.0003352240000822    0.18152283800009172     6.924199988134205e-05
10020  : 0.0368837539999731    0.00016821899953356478  0.03690350099986972 
20020  : 0.1258954189997894    0.00016980899999907706  0.12552139000035822 
30020  : 0.2532270950005113    0.00017434100027458044  0.2531732979996377  
40020  : 0.4068329990004713    0.00017545999980939087  0.406938215000082   
50020  : 0.6163399690003644    0.0001782059998731711   0.616294079999534   
60020  : 0.8626650409996728    0.00017887300055008382  0.8635997929995938  
70020  : 1.1321934719999263    0.00017422100063413382  1.130675204999534   
80020  : 1.369009857000492     0.00017408599978807615  1.369635687000482   
90020  : 1.7379734959995403    0.00017380499957653228  1.7343564000002516  

By using a wider range of inputs you see that it is not always much faster.

If doing timings, always consider edgecases of few and lots of data together with "normal amount"


(*) returns inf for z>170 - so not calculating it in earnest and timings are skewed

Upvotes: 1

Dinari
Dinari

Reputation: 2557

math.factorial is a python builtin function, it does not get compiled to python byte code, and is actually executing c code.

You can see similar behavior with other functions, such as numpy cos / sin etc...

Upvotes: 0

Joe Halliwell
Joe Halliwell

Reputation: 1177

When exact=True the scipy function is really just acting as a wrapper around math.factorial (see https://github.com/scipy/scipy/blob/v1.2.0/scipy/special/…) so I guess the extra time is just the overhead of the argument checking logic and extra function call.

When exact=False or unspecified you're using an approximation, which will be slower than the exact calculation for small values of n.

Upvotes: 0

Related Questions