Reputation: 3561
I have written two functions to calculate combinations. The first one uses a for loop, the other one uses a recursive factorial function. Why is the first faster than the second?
def combinations(n: int, k: int) -> int:
# Collection >= Selection
if n < k:
raise ValueError(
"The size of the collection we are selecting items from must be "
"larger than the size of the selection."
)
# Sizes > 0
if n < 0 or k < 0:
raise ValueError(
"Cannot work with negative integers."
)
# Compute with standard python only
numerator = 1
for i in range(n + 1 - k, n+1):
numerator *= i
denominator = 1
for i in range(1, k+1):
denominator *= i
return int(numerator / denominator)
The second function needs a factorial function defined as:
def factorial(n: int) -> int:
if n < 0:
raise ValueError(
"Cannot calculate factorial of a negative number."
)
# Recursive function up to n = 0
return n * factorial(n - 1) if n - 1 >= 0 else 1
And it is defined as:
def combinations2(n: int, k: int) -> int:
# Collection >= Selection
if n < k:
raise ValueError(
"The size of the collection we are selecting items from must be "
"larger than the size of the selection."
)
return int(factorial(n) / (factorial(k) * factorial(n - k)))
When I run the following test on IPython console, it is clear which one is faster
%timeit combinations(1000, 50)
16.2 µs ± 1.95 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
and
%timeit combinations2(1000, 50)
1.6 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
NEW VERSION OF COMBINATIONS2
Okay following the comments, I agree combinations2 is doing many more operations. So I rewrote both factorial and combinations function, here are their versions:
def factorial(n: int, lower: int=-1) -> int:
# n > 0
if n < 0:
raise ValueError(
"Cannot calculate factorial of a negative number."
)
# Recursive function up to n = 0 or up to lower bound
if n - 1 >= 0 and n - 1 >= lower:
return n * factorial(n - 1, lower)
return 1
which now can have a lower bound. Notice that in general factorial(a, b) = factorial(a) / factorial(b). Also, here is the new version of the combinations2 function:
def combinations2(n: int, k: int) -> int:
if n < k:
raise ValueError(
"The size of the collection we are selecting items from must be "
"larger than the size of the selection."
)
return int(factorial(n, n - k) / factorial(k))
But again, this is their comparison:
%timeit combinations(100, 50)
10.5 µs ± 1.67 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit combinations2(100, 50)
56.1 µs ± 5.79 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Upvotes: 4
Views: 1004
Reputation: 2090
Just count the number of operations:
In combinations
you are making (n+1) - (n+1-k)
multiplications for numerator, and (k+1) - 1
multiplications for denominator.
Total: 2k
multiplications
In cominations2
you are making n + k + (n-k)
multiplications, i.e. 2n
multiplications.
And you are making also 2n
function calls for recursion.
With k=50
and n=1000
, no wonder why the first solution is faster.
Upvotes: 4