Reputation: 121
I am new to Haskell and learning how to properly use recursion.
The following function (which uses a formula to calculate central binomial coefficients) is extremely slow; for instance, grid (20,20)
crashes my laptop. Can you help me understand why?
grid::(Integer,Integer)->Integer
grid (1, x) = 1 + x
grid (x, 1) = 1 + x
grid (x, y) = grid ((x-1),y) + grid ((x),(y-1))
Upvotes: 1
Views: 534
Reputation: 67567
Alternative to memoization is generating rows iteratively, reducing the number of computations.
central :: [Integer] -> [Integer]
central x = zipWith (+) x (0:central x)
For example, to generate the next row from previous
> central [1,2,3]
[1,3,6]
or for your grid function
grid x y = (iterate central [1..]) !! x !! y
and for zero based index
> grid 2 4
35
Upvotes: 0
Reputation: 63409
As explained in the other answers, the problem with your implementation is that the number of recursive calls is exponential, even though the number of distinct values of grid (x,y)
that need to be computed is just quadratic.
The solution to the problem is called memoization, which basically means caching values that have been computed before. I definitely recommend you to write your own implementation based on lists, as recommended by @bheklilr.
There is however a quick solution offered by existing libraries such as MemoTrie:
import Data.MemoTrie
grid :: (Integer, Integer) -> Integer
grid = memo grid'
grid' :: (Integer, Integer) -> Integer
grid' (1, x) = 1 + x
grid' (x, 1) = 1 + x
grid' (x, y) = grid (x - 1, y) + grid (x , y - 1)
Notice that now grid
is defined as a value - it's not polymorphic and it takes no arguments (although it's value is a function). The call to memo
creates a single instance of a trie that caches all values and uses grid'
to compute values not present in the cache.
Upvotes: 0
Reputation: 11
The reason why the execution of this function grinds to a crawl is that it utilizes multiple recursion e.g. the function calls itself twice upon each recursive call. That means that there are repeated computations taking place during the execution of this recursive function, and that the time complexity of the computation increases exponentially as the size of the inputs increase.
The effects of this are more noticeable with larger input values like 20.
Let's look at a call to grid(5, 5).
This expands as follows.
grid(5, 5)
grid(4, 5) + grid(5, 4)
(grid(3, 5) + grid(4, 4)) + (grid(4, 4) + grid(5, 3))
((grid(2, 5) + grid(3, 4)) + (grid(4, 3) + grid(3, 4))) +
((grid(3, 4) + grid(4, 3)) + (grid(4, 3) + grid(5, 2)))
...and so on
As you can see things get out of hand quickly even with small values of x and y, grid(3, 4) and grid(4, 3) are calculated multiple times. As stated previously, a solution that utilizes zipWith will be much more efficient.
Upvotes: 1
Reputation: 54078
Notably, there's not caching or memoization in your algorithm. GHC does not do magic and will not optimize away problems like that. For a 5x5 grid you're calling grid
139 times, for a 6x6 503, for a 7x7 it's 1847, and for 10x10 it's 97239 times. By the time you get to 20x20 you're making so many recursive calls that it's just not feasible. It's the same concept as doing
fib 0 = 1
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
You're going to have an exponential number of calls as you increase n
, slowing you down. Instead, you can approach this problem similarly to how it's solved in the case of the Fibonacci sequence, using lists and memoization:
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
Except here you want it to calculate the binomial coefficients. As for the implementation of such an algorithm, you'll have to figure that out yourself ;) I can point you at a previous answer of mine that solved the problem for generating Pascal's triangle.
Upvotes: 4