Reputation: 4738
After reading a memoization introduction I reimplemented the Fibonacci example by using a more general memoize function (only for learning purposes):
memoizer :: (Int -> Integer) -> Int -> Integer
memoizer f = (map f [0 ..] !!)
memoized_fib :: Int -> Integer
memoized_fib = memoizer fib
where fib 0 = 0
fib 1 = 1
fib n = memoized_fib (n-2) + memoized_fib (n-1)
This works, but when I just change the last line to the following code, memoization suddenly does not work as I expected (the program becomes slow again):
fib n = memoizer fib (n-2) + memoizer fib (n-1)
Where is the crucial difference w.r.t. to memoization?
Upvotes: 7
Views: 864
Reputation: 71065
It is about explicit vs. implicit sharing. When you explicitly name a thing, it naturally can be shared, i.e. exist as separate entity in memory, and reused. (Of course sharing is not part of the language per se, we can only nudge the compiler ever so slightly towards sharing certain things).
But when you write same expression twice or thrice, you rely on compiler to replace the common sub-expressions with one explicitly shared entity. That might or might not happen.
Your first variant is equivalent to
memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!) where
fib 0 = 0
fib 1 = 1
fib n = memoized_fib (n-2) + memoized_fib (n-1)
Here you specifically name an entity, and refer to it by that name. But that is a function. To make the reuse even more certain, we can name the actual list of values that gets shared here, explicitly:
memoized_fib :: Int -> Integer
memoized_fib = (fibs !!) where
fibs = map fib [0 ..]
fib 0 = 0
fib 1 = 1
fib n = memoized_fib (n-2) + memoized_fib (n-1)
The last line can be made yet more visually apparent, with explicit reference to the actual entity which is shared here - the list fibs
which we just named in the step above:
fib n = fibs !! (n-2) + fibs !! (n-1)
Your second variant is equivalent to this:
memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!) where
fib 0 = 0
fib 1 = 1
fib n = (map fib [0 ..] !!) (n-2) + (map fib [0 ..] !!) (n-1)
Here we have three seemingly independent map
expressions, which might or might not get shared by a compiler. Compiling it with ghc -O2
seems to reintroduce sharing, and with it the speed.
Upvotes: 6
Reputation: 13223
momoized_fib = ...
- that's top-level simple definition. it might be read as a constant lazy value (without any additional arguments required to be bound before expanding it. That's kinda "source" of your memoized values.
When you use (memoizer fib) (n-2)
creates new source of values which have no relation with memoized_fib
and thus it isn't reused. Actually you move a lot of load on GC here since you produce a lot (map fib [0 ..])
sequences in second variant.
Consider also more simple example:
f = \n -> sq !! n where sq = [x*x | x <- [0 ..]]
g n = sq !! n where sq = [x*x | x <- [0 ..]]
First will generate single f
and associated with it sq
because there is no n
in head of declaration. Second will produce family of lists for each different value of f n
and move over it (without bounding down to actual values) to get value.
Upvotes: 3