Reputation: 123
I am attempting to determine whether an algorithm I wrote operates in polynomial time, and at the moment I can't figure out how to count the calls made by this function that it uses:
def combo(list, size):
    if size == 0 or not list:                         # order doesn't matter
        return [list[:0]]                             # xyz == yzx
    else:
        result = []
        for i in range(0, (len(list) - size) + 1):    # iff enough left
            pick = list[i:i+1]
            rest = list[i+1:]                         # drop [:i] part
            for x in combo(rest, size - 1):
                result.append(pick + x)
        return result
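For concreteness, here is a small call to the function above (a sketch with the parameter renamed so it doesn't shadow the built-in `list`; the logic is unchanged):

```python
def combo(lst, size):
    # Same algorithm as above, with the parameter renamed
    # to avoid shadowing the built-in list.
    if size == 0 or not lst:
        return [lst[:0]]
    result = []
    for i in range(0, (len(lst) - size) + 1):
        pick = lst[i:i+1]
        rest = lst[i+1:]
        for x in combo(rest, size - 1):
            result.append(pick + x)
    return result

print(combo(['x', 'y', 'z'], 2))
# [['x', 'y'], ['x', 'z'], ['y', 'z']]
```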
Upvotes: 1
Views: 182
Reputation: 42421
You've got an algorithm for "k-combinations": given n items, select k of them, treating ordering as irrelevant. From the ancients, we know how many combinations to expect:
n!
-----------
(n - k)! k!
For a given n (for example, 10), that expression is maximized when k equals half of n (5). As k approaches either extreme (0 or n), the number of combinations gets much smaller.
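A quick check with math.comb (Python 3.8+) illustrates that peak at k = n/2:

```python
from math import comb

# Number of k-combinations of n = 10 items, for k = 0..10.
n = 10
counts = [comb(n, k) for k in range(n + 1)]
print(counts)
# [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
```

The maximum (252) sits squarely at k = 5, and the counts fall off symmetrically on either side.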
With a little bit of reorganizing and simplifying, we can rewrite your code so that the number of calls to combos() is roughly equal to the number of combinations in the worst case. Interestingly, the number of calls and the number of combinations have a nicely symmetrical inverse relationship.
Most important is that both are bounded above by the formula shown above for the worst case. That is effectively the O() bound that you're asking for. But maybe not exactly, because the rewritten code makes fewer subroutine calls than yours, even though they produce the same results: the short-circuiting logic in the example below prevents the extra calls and thus lets the worst-case argument apply cleanly.
If that formula is your worst-case bound, does your algorithm run in polynomial time? I'm more intuitive than expert on such matters, but I think the answer is no. The worst case is when k = n / 2, which gives you the following simplification. Even though the denominator gets really big really fast, it pales in comparison to the Chuck-Norris rate of growth of the numerator.
n!
-------------
(n/2)! (n/2)!
# For example, when n = 40.
product(1..40)                  product(21..40)   # Eat my dust, Homer!
----------------------------- = ---------------
product(1..20) product(1..20)   product(1..20)    # Doh!
# Q.E.D.
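To make that superpolynomial growth concrete, here is a small numeric check (the comparison against n**5 is just an illustrative sample polynomial, not part of the original argument):

```python
from math import comb, factorial

# C(n, n/2) written both ways: via the factorial formula and via math.comb.
n = 40
central = factorial(n) // (factorial(n // 2) ** 2)
print(central)                  # 137846528820
print(central == comb(40, 20))  # True

# Already far beyond the sample polynomial 40**5 = 102400000.
print(central > 40**5)          # True
```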
An empirical illustration for many values of n and k:
from itertools import combinations
from math import factorial

n_calls = 0

def combos(vals, size):
    # Track the number of calls.
    global n_calls
    n_calls += 1
    # Basically your algorithm, but simplified
    # and written as a generator.
    for i in range(0, len(vals) - size + 1):
        v = vals[i]
        if size == 1:
            yield [v]
        else:
            for c in combos(vals[i+1:], size - 1):
                yield [v] + c

def expected_n(n, k):
    # The mathematical formula for the expected N of k-combinations.
    # Integer division is exact here: the result is a binomial coefficient.
    return factorial(n) // ( factorial(n - k) * factorial(k) )

def main():
    global n_calls
    # Run through a bunch of values for n and k.
    max_n = 15
    for n in range(1, max_n + 1):
        # Worst case is when k is half of n.
        worst_case = expected_n(n, n // 2)
        for k in range(1, n + 1):
            # Get the combos and count the calls.
            n_calls = 0
            vs = list(range(n))
            cs = list(combos(vs, k))
            # Our result agrees with:
            # - itertools.combinations
            # - the math
            # - the worst-case analysis
            assert cs == list(list(c) for c in combinations(vs, k))
            assert len(cs) == expected_n(n, k)
            assert n_calls <= worst_case
            assert len(cs) <= worst_case
            # Inspect the numbers for one value of n.
            if n == max_n:
                print([n, k, len(cs), n_calls])

main()
Output:
[15, 1, 15, 1]
[15, 2, 105, 15]
[15, 3, 455, 105]
[15, 4, 1365, 455]
[15, 5, 3003, 1365]
[15, 6, 5005, 3003]
[15, 7, 6435, 5005]
[15, 8, 6435, 6435]
[15, 9, 5005, 6435]
[15, 10, 3003, 5005]
[15, 11, 1365, 3003]
[15, 12, 455, 1365]
[15, 13, 105, 455]
[15, 14, 15, 105]
[15, 15, 1, 15]
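The call counts in that table look like the combination counts shifted by one: the number of calls for a given k appears to equal C(15, k - 1). A quick check of that pattern against the printed numbers (my observation from the table, not something the output proves in general):

```python
from math import comb

# Call counts from the table above, for n = 15 and k = 1..15.
calls = [1, 15, 105, 455, 1365, 3003, 5005, 6435,
         6435, 5005, 3003, 1365, 455, 105, 15]

# They line up with C(15, k - 1) for every k.
shifted = [comb(15, k - 1) for k in range(1, 16)]
print(shifted == calls)
# True
```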
Upvotes: 1
Reputation: 30230
It depends on your size variable.
Let n be the length of your list list (which shadows the built-in list, by the way).
For size = 1, you're looking at n calls to combo.
If we define a function f(n) = 1 + 2 + 3 + ... + (n-1), then for size = 2 you're looking at f(n) function calls.
If we define a function g(n) = f(1) + f(2) + f(3) + ... + f(n-1), then for size = 3 you're looking at g(n) function calls.
So it seems you could say that your function's complexity is bounded by O(n ^ s), where n is the length of your list and s is your size parameter.
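A quick empirical check of that bound, instrumenting the question's combo with a call counter (the counter and the parameter rename are mine):

```python
calls = 0

def combo(lst, size):
    # The question's algorithm, instrumented with a call counter
    # (parameter renamed so it doesn't shadow the built-in list).
    global calls
    calls += 1
    if size == 0 or not lst:
        return [lst[:0]]
    result = []
    for i in range(len(lst) - size + 1):
        pick = lst[i:i+1]
        rest = lst[i+1:]
        for x in combo(rest, size - 1):
            result.append(pick + x)
    return result

# For fixed size s, the call count stays within n**s plus a small constant.
for n in (6, 10, 14):
    for s in (1, 2, 3):
        calls = 0
        combo(list(range(n)), s)
        print(n, s, calls)
```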
Upvotes: 0
Reputation: 544
Take a look at the Run Snake Run profile viewer. It takes a profile output and creates a nice visualization of the function calls.
You run your program with the cProfile module and then send the output log to Run Snake Run:
python -m cProfile -o profile.log your_program.py
runsnake profile.log
That example is for Linux; Windows usage probably varies slightly.
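If you'd rather inspect a profile without the GUI, the standard-library pstats module can summarize the same kind of log. A minimal sketch (work is just a placeholder workload; 'profile.log' matches the file name from the commands above):

```python
import cProfile
import pstats

def work():
    # Placeholder workload to profile.
    return sum(i * i for i in range(10_000))

# Profile the workload in-process and dump the stats to a file.
pr = cProfile.Profile()
pr.enable()
work()
pr.disable()
pr.dump_stats('profile.log')

# Summarize the log: top functions sorted by cumulative time.
stats = pstats.Stats('profile.log')
stats.sort_stats('cumulative').print_stats(5)
```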
Upvotes: 0