Reputation: 95
I tried to implement memoisation using arrays in a recursive fibonacci function, fibmem()
expecting the runninng time to come out as O(n). Initially, it looked as though I had it, as it took much faster than the regular recursive fibonacci function to run. (red being the regular fib()
and green being the fibmem()
)
but upon further inspection, (fibmem()
is represented by red)
it looks as though fibmem()
runs in O(someconstant^n) time instead. Here's the code:
memo = [0] * 100 #initialise the array
argument = sys.argv[1]
def fibmem(n):
if n < 0:
return "NO"
if n == 0 or n == 1:
memo[n] = 1
return memo[n]
if n not in memo:
memo[n] = fibmem(n-1) + fibmem(n-2)
return memo[n]
Now I can get fibmem()
to run in O(n) time using dictionaries instead of arrays in this way:
memo = {0:1, 1:1}
argument = sys.argv[1]
def fibmem(n):
if n not in memo:
memo[n] = fibmem(n-1) + fibmem(n-2)
return memo[n]
But I would think that my implementation using arrays is similar. I just can't see why the array implementation of fibmem()
runs in exponential time. What's going on? How would I go about fixing things?
Upvotes: 2
Views: 74
Reputation: 28596
The real problem isn't that the in
operator scans the list and takes linear time but that you're doing it completely wrong.
Your memo
will be filled with [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...]
. So when n
is for example 40, and you're thus checking 40 not in memo
, it will simply always fail, because 40 is not a Fibonacci number. Clearly you mean to check whether the 40-th Fibonacci number has already been computed, but that's not at all what you're actually checking. You're instead checking whether 40 is a (already computed) Fibonacci number.
So you get a shortcut only whenever n
itself happens to be a Fibonacci number, for example at 34. But then until 55, you never get any such shortcuts, effectively disabling your memoization entirely (in those ranges). That's why you get exponential behavior there, just like the non-memoized version does earlier on.
Also note that striking interruption of the curve between n=35 and n=36. That's not just a fluke, that's precisely because 34 is a Fibonacci number. The case n=36 goes back to n=35 and n=34, and because n=34 is an instant shortcut, only the n=35 part involves real work. That's why n=36 takes pretty much the exact same time as n=35 (it was a fluke that it took slightly less when you measured it).
Instead of if n not in memo:
you should check if memo[n] == 0:
or if not memo[n]:
.
Or instead, use a dictionary: memo = {}
. Then your if n not in memo:
does what it's supposed to do (because it checks for keys, not for values). That also has the advantage of not being limited.
Upvotes: 5
Reputation: 15300
Your problem is that the in
operator, on a list, scans the list from start to finish. It doesn't magically know that you are storing things in order.
You can use sets, which have this built in, to get around this. Or you can use array lookups to check if an array value is set:
memo = [1,1] + [0]*98
def fibmem(n):
answer = memo[n]
if answer == 0:
answer = fibmem(n-1) + fibmem(n-2)
memo[n] = answer
return answer
Upvotes: 2