popcorndude

Reputation: 123

Determine computational complexity of recursive algorithm

I am attempting to determine whether an algorithm I wrote operates in polynomial time, and at the moment I can't figure out how to count the calls to this function that it uses:

def combo(list, size):
    if size == 0 or not list:                            # order doesn't matter
        return [list[:0]]                                # xyz == yzx
    else:
        result = []
        for i in range(0, (len(list) - size) + 1):       # iff enough left
            pick = list[i:i+1]
            rest = list[i+1:]                            # drop [:i] part
            for x in combo(rest, size - 1):
                result.append(pick + x)
        return result

Upvotes: 1

Views: 182

Answers (3)

FMc

Reputation: 42421

You've got an algorithm for "k-combinations": given n items, select k of them, treating ordering as irrelevant. From the ancients, we know how many combinations to expect:

    n!
-----------
(n - k)! k!

For a given n (for example, 10), that expression is maximized when k equals half of n (5). As k approaches either extreme (0 or n), the number of combinations gets much smaller.
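As a quick sanity check (my addition, not part of this answer, and assuming Python 3.8+ for math.comb), the counts do peak at k = n // 2 and fall off toward both ends:

from math import comb

# Sanity check (my addition): k-combination counts for n = 10 peak at k = 5.
n = 10
counts = [comb(n, k) for k in range(n + 1)]
print(counts)                           # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
print(max(counts) == comb(n, n // 2))   # True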

With a little bit of reorganizing and simplifying, we can rewrite your code so that the number of calls to combos() is roughly equal to the number of combinations in the worst case. Interestingly, the number of calls and the number of combinations have a nicely symmetrical inverse relationship.

Most important, both are bounded above by the worst-case value of the formula shown above. That is effectively the O() bound you're asking for, though perhaps not exactly for your version: the rewritten code makes fewer subroutine calls than yours, even though both produce the same results. The short-circuiting logic in the example below avoids those extra calls and thus lets the worst-case argument apply cleanly.

If that formula is your worst-case bound, does your algorithm run in polynomial time? I'm more intuitive than expert on such matters, but I think the answer is no. The worst case is when k = n / 2, which gives you the following simplification. Even though the denominator gets really big really fast, it pales in comparison to the Chuck-Norris rate of growth of the numerator.

      n!
-------------
(n/2)! (n/2)!

# For example, when n = 40.

       product(1..40)             product(      21..40)   # Eat my dust, Homer!
-----------------------------  =  ---------------------
product(1..20) product(1..20)     product(1..20       )   # Doh!

# Q.E.D.
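A small numerical illustration of that growth (my addition, assuming Python 3.8+ for math.comb): the central count roughly quadruples every time n increases by 2, so it eventually outruns any fixed polynomial.

from math import comb

# My addition, not part of the original answer: C(n, n // 2) roughly
# quadruples each time n grows by 2, which is exponential, not polynomial, growth.
prev = None
for n in range(10, 41, 2):
    c = comb(n, n // 2)
    ratio = None if prev is None else round(c / prev, 2)
    print(n, c, ratio)    # the ratio climbs toward 4.0 as n grows
    prev = c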

An empirical illustration for many values of n and k:

from itertools import combinations
from math import factorial

n_calls = 0

def combos(vals, size):
    # Track the number of calls.
    global n_calls
    n_calls += 1

    # Basically your algorithm, but simplified
    # and written as a generator.
    for i in range(0, len(vals) - size + 1):
        v = vals[i]
        if size == 1:
            yield [v]
        else:
            for c in combos(vals[i+1:], size - 1):
                yield [v] + c

def expected_n(n, k):
    # The mathematical formula for expected N of k-combinations.
    return factorial(n) // ( factorial(n - k) * factorial(k) )  # // keeps the count an exact integer

def main():
    global n_calls

    # Run through a bunch of values for n and k.
    max_n = 15
    for n in range(1, max_n + 1):
        # Worst case is when k is half of n.
        worst_case = expected_n(n, n // 2)

        for k in range(1, n + 1):
            # Get the combos and count the calls.
            n_calls = 0
            vs = list(range(n))
            cs = list(combos(vs, k))

            # Our result agrees with:
            #   - itertools.combinations
            #   - the math
            #   - the worst-case analysis
            assert cs      == list(list(c) for c in combinations(vs, k))
            assert len(cs) == expected_n(n, k)
            assert n_calls <= worst_case
            assert len(cs) <= worst_case

            # Inspect the numbers for one value of n.
            if n == max_n:
                print([n, k, len(cs), n_calls])

main()

Output:

[15, 1, 15, 1]
[15, 2, 105, 15]
[15, 3, 455, 105]
[15, 4, 1365, 455]
[15, 5, 3003, 1365]
[15, 6, 5005, 3003]
[15, 7, 6435, 5005]
[15, 8, 6435, 6435]
[15, 9, 5005, 6435]
[15, 10, 3003, 5005]
[15, 11, 1365, 3003]
[15, 12, 455, 1365]
[15, 13, 105, 455]
[15, 14, 15, 105]
[15, 15, 1, 15]

Upvotes: 1

jedwards

Reputation: 30230

It depends on your size variable.

Let n be the length of your list, list (which shadows the built-in name, by the way).

For size = 1, you're looking at n calls to combo.

If we define a function f(n) = 1 + 2 + 3 + ... + (n-1),

... for size = 2, you're looking at f(n) function calls.

If we define a function g(n) = f(1) + f(2) + f(3) + ... + f(n-1),

... for size = 3, you're looking at g(n) function calls.

So it seems you could say that your function's complexity is bounded by O(n ^ s), where n is the length of your list and s is your size parameter.
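Here is a minimal sketch (my addition, not part of this answer) that wraps the question's combo() with a call counter, so the actual counts can be compared against the n ^ s bound; the parameter is renamed to lst to avoid shadowing the built-in:

def count_calls(n, size):
    # My addition: count recursive calls made by the question's combo().
    calls = 0

    def combo(lst, size):
        nonlocal calls
        calls += 1
        if size == 0 or not lst:
            return [lst[:0]]
        result = []
        for i in range(0, len(lst) - size + 1):
            pick = lst[i:i+1]
            rest = lst[i+1:]
            for x in combo(rest, size - 1):
                result.append(pick + x)
        return result

    combo(list(range(n)), size)
    return calls

# Print actual call counts next to the n ** s bound for comparison.
for s in range(1, 4):
    for n in range(s, 9):
        print(s, n, count_calls(n, s), n ** s)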

Upvotes: 0

bojangler

Reputation: 544

Take a look at the Run Snake Run profile viewer. It takes a profile output and creates a nice visualization of the function calls.

You run your program with the cProfile module and then send the output log to Run Snake Run:

python -m cProfile -o profile.log your_program.py
runsnake profile.log

That example is for Linux; Windows usage probably varies slightly.
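If you only want the raw numbers, here is a minimal sketch (my addition, not from this answer) that reads the same profile.log with the standard-library pstats module instead of the GUI:

import pstats

# My addition: inspect the profile saved by the cProfile command above,
# sorted by call count, showing the ten busiest entries.
stats = pstats.Stats('profile.log')
stats.sort_stats('ncalls').print_stats(10)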

Upvotes: 0
