Reputation: 93
Why is math.factorial so much faster than scipy.special.factorial?
import timeit
t = timeit.timeit("from math import factorial; factorial(20)"); print(t)
0.6399730000412092
t = timeit.timeit("from scipy.special import factorial; factorial(20)"); print(t)
5.339432950946502
t = timeit.timeit("from scipy.special import factorial; factorial(20, exact=True)"); print(t)
1.7984685270348564
I am on Python 3.7 (SciPy version 1.1.0).
Upvotes: 1
Views: 1145
Reputation: 13175
This is a common mistake, similar to expecting things like np.exp() to be faster than the equivalents in the math module. That is not what these functions are for. The scientific stack (NumPy, Pandas, SciPy and others) is concerned with vectorized operations across whole arrays, not single values.
from math import factorial
factorial([20, 20, 20])
This will give TypeError: an integer is required (got type list)
But:
from scipy.special import factorial
factorial([20, 20, 20])
Will compute the factorial for the whole list, giving:
array([2.43290201e+18, 2.43290201e+18, 2.43290201e+18])
If you dropped your math.factorial calculation into a for loop to cover multiple items in the list, it would very quickly fall behind the vectorized approach in timings (which would be faster still if you supplied a NumPy array rather than a list in the first place).
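To see the loop-side cost concretely, here is a stdlib-only sketch of the per-element approach (the vectorized side would be a single scipy.special.factorial(values) call, as shown above): every iteration pays Python-level call overhead, which is exactly what vectorization amortizes.

```python
import timeit
from math import factorial

values = [20] * 1000

# Per-element Python loop: one interpreted function call per item.
def loop_factorials(xs):
    return [factorial(x) for x in xs]

results = loop_factorials(values)
print(results[0])  # 2432902008176640000

# Timing the loop; scipy.special.factorial(values) would instead
# process the whole list in one vectorized call.
t_loop = timeit.timeit(lambda: loop_factorials(values), number=100)
print(f"loop over {len(values)} items x100: {t_loop:.4f}s")
```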
Upvotes: 5
Reputation: 51643
Besides the already mentioned fact that SciPy targets vectorized inputs: you are using a very small test case here. If you widen it, the picture won't be as clear-cut:
import timeit
for z in range(20, 100000, 10000):
    t1 = timeit.timeit(f"from math import factorial; factorial({z})", number=10)
    t2 = timeit.timeit(f"from scipy.special import factorial; factorial({z})", number=10)
    t3 = timeit.timeit(f"from scipy.special import factorial; factorial({z}, exact=True)", number=10)
    print(f"{z} : {t1:<20} {t2:<20} {t3:<20}")
Output:
# facNr factorial scipy.special.factorial(*) exact = True
20 : 0.0003352240000822 0.18152283800009172 6.924199988134205e-05
10020 : 0.0368837539999731 0.00016821899953356478 0.03690350099986972
20020 : 0.1258954189997894 0.00016980899999907706 0.12552139000035822
30020 : 0.2532270950005113 0.00017434100027458044 0.2531732979996377
40020 : 0.4068329990004713 0.00017545999980939087 0.406938215000082
50020 : 0.6163399690003644 0.0001782059998731711 0.616294079999534
60020 : 0.8626650409996728 0.00017887300055008382 0.8635997929995938
70020 : 1.1321934719999263 0.00017422100063413382 1.130675204999534
80020 : 1.369009857000492 0.00017408599978807615 1.369635687000482
90020 : 1.7379734959995403 0.00017380499957653228 1.7343564000002516
Using a wider range of inputs shows that math.factorial is not always much faster. When timing, always consider edge cases with very little and very much data alongside a "normal" amount.
(*) returns inf for z > 170, so it is not really computing the result there and those timings are skewed.
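That inf cutoff comes from the range of a double, not from SciPy itself. A quick stdlib check (no SciPy needed) shows 170! is the largest factorial that fits in an IEEE-754 double:

```python
from math import factorial

# 170! still fits in a double (about 7.26e306 < ~1.8e308 max).
fits = float(factorial(170))
print(fits)

# 171! (about 1.24e309) exceeds the double range, so any
# float-based implementation must return inf or overflow here.
try:
    float(factorial(171))
    overflowed = False
except OverflowError:
    overflowed = True
print("171! overflows float:", overflowed)
```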
Upvotes: 1
Reputation: 2557
math.factorial is a Python built-in function: it is implemented in C, so calling it executes compiled C code rather than Python bytecode.
You can see similar behavior with other compiled functions, such as NumPy's cos / sin etc.
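You can verify this from the interpreter (stdlib-only sketch): a built-in like math.factorial is a builtin_function_or_method with no Python code object behind it, while a pure-Python function carries bytecode.

```python
import math

# C-implemented built-in: no __code__ attribute, no bytecode.
print(type(math.factorial).__name__)        # builtin_function_or_method
print(hasattr(math.factorial, "__code__"))  # False

# Pure-Python equivalent for contrast: has a bytecode object.
def py_factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(hasattr(py_factorial, "__code__"))    # True
print(py_factorial(20) == math.factorial(20))  # True
```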
Upvotes: 0
Reputation: 1177
When exact=True, the scipy function is really just acting as a wrapper around math.factorial (see https://github.com/scipy/scipy/blob/v1.2.0/scipy/special/…), so I guess the extra time is just the overhead of the argument-checking logic and the extra function call. When exact=False or unspecified, you're using a floating-point approximation based on the gamma function, which is slower than the exact calculation for small values of n.
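The difference between the two paths can be sketched with the stdlib alone, using the gamma-function identity gamma(n + 1) == n! (an assumption here is only that SciPy's non-exact path behaves like this gamma evaluation, as described above):

```python
from math import factorial, gamma

n = 20

exact = factorial(n)   # arbitrary-precision int, like exact=True
approx = gamma(n + 1)  # float approximation, like exact=False

print(exact)
print(approx)
# The float path agrees with the exact result only to double precision:
print(abs(approx - exact) / exact)
```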
Upvotes: 0