Reputation: 45
I am trying to compute the complexity of this function, taking into account its parameters: the length of 'string' and the length of each generated combination, 'size'.
I don't know whether 'size' is an important factor in the complexity.
I know, more or less, that it will be roughly the number of combinations of len(string) items taken size at a time: C(len(string), size).
But how can I compute it formally, and how should I write it? O(C(len(string), size))?
Can someone help me? Thank you very much.
def comb(string, size, r=''):
    if len(r) == size:
        print(r)
        return
    for i, e in enumerate(string):
        comb(string[i + 1:], size, r + e)

comb('abcde', 2)
comb('abcde', 3)
The output of the call comb('abcde', 2) is: ab ac ad ae bc bd be cd ce de
For comb('abcde', 3) we have: abc abd abe acd ace ade bcd bce bde cde
Upvotes: 1
Views: 273
Reputation: 987
First of all, your function is slightly weighed down by making extra string slices to pass down at every iteration. If the string you're currently working with has length m, you're doing O(m + (m-1) + (m-2) + ... + 1) = O(m^2) extra work per call just to build the slices, when you could simply be passing down an index to start iterating from. You could also collect the characters of a given combo in a list and join them at the end: building a string of size characters one character at a time with an immutable string type is an O(size^2) operation, when you could be getting away with one final O(size) join of a completed combo before printing. So I would make these small changes, which will help make the complexity analysis a little simpler too:
def combos(string, size):
    def _combos(i=0, partial=[]):
        if len(partial) == size:
            print(''.join(partial))
            return
        for j in range(i, len(string)):
            partial.append(string[j])
            _combos(j + 1, partial)
            partial.pop()
    _combos()

combos('abcde', 2)
combos('abcde', 3)
I'm passing down an index i to iterate over the original string rather than building a new string for each call, and I refactored r into the list partial, which I mutate directly and backtrack on with O(1) append and pop operations, joining only once len(partial) == size. This avoids the extra O(size^2) time per combo of building each combo one character at a time.
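To confirm the refactor still prints exactly the same combinations, here is a quick self-check against itertools.combinations (the itertools/contextlib usage is my addition, not part of the original answer):

```python
import io
from contextlib import redirect_stdout
from itertools import combinations

def combos(string, size):
    def _combos(i=0, partial=[]):
        if len(partial) == size:
            print(''.join(partial))
            return
        for j in range(i, len(string)):
            partial.append(string[j])
            _combos(j + 1, partial)
            partial.pop()
    _combos()

# capture what combos() prints and compare against itertools
buf = io.StringIO()
with redirect_stdout(buf):
    combos('abcde', 2)
printed = buf.getvalue().split()
assert printed == [''.join(c) for c in combinations('abcde', 2)]
# both produce: ab ac ad ae bc bd be cd ce de
```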
Letting n stand for the length of the string and k = size, you could now roughly say that this algorithm runs in O(k*(n C k)) time: it generates all n choose k = n!/((n-k)!*k!) combos, then takes O(k) time to join and print each one. This is mostly correct, but it ignores the work it takes to recursively build each combo. If you want to be more precise than this, you have to look at how many recursive calls are being made, and how much work is done in each recursive call. A recursion tree can be a useful way to think about this:
Our initial, single recursive call loops over the whole string, doing constant-time append/pop work and making a recursive call for each of the n possible suffixes (string[1:], string[2:], ..., string[n:]). This holds in general: the amount of work each recursive call does is linear in the length of the suffix string[i:] it receives. The only non-constant work any recursive call does is to generate further recursive calls on smaller suffixes of the string, until depth k is reached.
level:   calls:   work/call:   total work:
1        1        n            n = O(n C 1)
Two levels down, there are n recursive calls operating on suffixes of the string (though one operates on the empty suffix string[n:], doing no work and generating no further calls, so there are n-1 recursive calls we care about). Each call does work proportional to the length of the suffix it receives, and each receives a shorter and shorter suffix, so the total work done at this level looks like:
level:   calls:   work/call:             total work:
2        n        varies from 0 to n-1   1 + 2 + ... + (n-1) = n(n-1)/2 = O(n C 2)
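The closed form in that row is easy to check numerically; a quick sketch using math.comb (my addition, not used in the original):

```python
import math

n = 10
# total level-2 work: 1 + 2 + ... + (n-1)
level2_work = sum(range(1, n))
assert level2_work == n * (n - 1) // 2 == math.comb(n, 2)  # 45 for n = 10
```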
These calls in turn generate n-2 size-1 calls, n-3 size-2 calls, ..., and 1 size-(n-2) call. It's getting a little messy and would be easier to prove by induction at this point, but you can take it that 3 levels down the total work is O(n C 3), and that this pattern holds up to depth k. Essentially, each level i is responsible for generating all size-i combinations, and for passing them down to the next level to generate the size-(i+1) combinations.
level:   calls:                             work/call:             total work:
3        n(n-1)/2                           varies from 0 to n-2   n(n-1)(n-2)/(2*3) = O(n C 3)
...
k        n(n-1)...(n-k+1)/(k*(k-1)*...*2)   O(k)                   k*calls = k*n!/((n-k)!*k!) = O(k*(n C k))
We can sum across all levels to get the total time complexity for the algorithm:

O((n C 1) + (n C 2) + ... + (n C (k-1)) + k*(n C k))
This is potentially worse than the original estimate of O(k*(n C k)), because we do some extra work building partial combinations, some of which never become full size-k combinations. (You can verify that the sum of the combinations (n C i) from i=0 to k does represent the number of recursive calls made, via the assertion in the instrumented version of the algorithm below.) It is possible to bring the complexity of printing all combinations down to the optimal O(k*(n C k)) with a different sort of algorithm, but this algorithm isn't far from optimal, and Gray codes are an entirely different topic.
from functools import cache

@cache
def fact(n):
    if n <= 1:
        return 1
    return n * fact(n - 1)

# n choose k: n!/(k!(n-k)!), using integer division to stay exact
def nCk(n, k):
    return fact(n) // (fact(n - k) * fact(k))

# sum of nCk(n, i) from i = 0 to k (expected number of recursive calls)
def expect(n, k):
    expectedCalls = 0
    for i in range(0, k + 1):
        expectedCalls += nCk(n, i)
    return expectedCalls

def combos(string, size):
    callCount = 0
    result = []
    def _combos(i=0, partial=[]):
        # print(f"_combos({i}, {partial})")
        nonlocal callCount
        callCount += 1
        if len(partial) == size:
            result.append(''.join(partial))
            return
        for j in range(i, len(string)):
            partial.append(string[j])
            _combos(j + 1, partial)
            partial.pop()
    _combos()
    print(f'numCombos({string}, {size}):', len(result))
    print('callCount:', callCount)
    assert callCount == expect(len(string), size)
    return result

print(combos('abcde', 2))
print(combos('abcde', 3))
print(combos('abcde', 5))
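As a cross-check on the numbers above, math.comb and itertools.combinations from the standard library (my additions, not in the original) reproduce both the combination count and the total recursive call count for the 'abcde' example:

```python
import math
from itertools import combinations

n, k = 5, 2
# combos('abcde', 2) produces C(5, 2) = 10 combinations...
assert len(list(combinations('abcde', k))) == math.comb(n, k) == 10
# ...using sum of C(n, i) for i = 0..k = 1 + 5 + 10 = 16 recursive calls
assert sum(math.comb(n, i) for i in range(k + 1)) == 16
```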
Upvotes: 1