brander
brander

Reputation: 121

Haskell recursion performance calculating central binomial coefficients

I am new to Haskell and learning how to properly use recursion.

The following function (which uses a formula to calculate central binomial coefficients) is extremely slow; for instance, grid (20,20) crashes my laptop. Can you help me understand why?

grid::(Integer,Integer)->Integer

grid (1, x)  = 1 + x

grid (x, 1)  = 1 + x

grid (x, y)  = grid ((x-1),y) + grid ((x),(y-1))

Upvotes: 1

Views: 534

Answers (4)

karakfa
karakfa

Reputation: 67567

Alternative to memoization is generating rows iteratively, reducing the number of computations.

central :: [Integer] -> [Integer]
central x = zipWith (+) x (0:central x)

For example, to generate the next row from previous

> central [1,2,3]
[1,3,6]

or for your grid function

grid x y = (iterate central [1..]) !! x !! y

and for zero based index

> grid 2 4
35

Upvotes: 0

Petr
Petr

Reputation: 63409

As explained in the other answers, the problem with your implementation is that the number of recursive calls is exponential, even though the number of distinct values of grid (x,y) that need to be computed is just quadratic.

The solution to the problem is called memoization, which basically means caching values that have been computed before. I definitely recommend you to write your own implementation based on lists, as recommended by @bheklilr.

There is however a quick solution offered by existing libraries such as MemoTrie:

import Data.MemoTrie

grid :: (Integer, Integer) -> Integer
grid = memo grid'

grid' :: (Integer, Integer) -> Integer
grid' (1, x)  = 1 + x
grid' (x, 1)  = 1 + x
grid' (x, y)  = grid (x - 1, y) + grid (x , y - 1)

Notice that now grid is defined as a value - it's not polymorphic and it takes no arguments (although it's value is a function). The call to memo creates a single instance of a trie that caches all values and uses grid' to compute values not present in the cache.

Upvotes: 0

mortrax
mortrax

Reputation: 11

The reason why the execution of this function grinds to a crawl is that it utilizes multiple recursion e.g. the function calls itself twice upon each recursive call. That means that there are repeated computations taking place during the execution of this recursive function, and that the time complexity of the computation increases exponentially as the size of the inputs increase.

The effects of this are more noticeable with larger input values like 20.

Let's look at a call to grid(5, 5).

This expands as follows.

grid(5, 5)

grid(4, 5) + grid(5, 4)

(grid(3, 5) + grid(4, 4)) + (grid(4, 4) + grid(5, 3))

((grid(2, 5) + grid(3, 4)) + (grid(4, 3) + grid(3, 4))) + 
((grid(3, 4) + grid(4, 3)) + (grid(4, 3) + grid(5, 2)))

...and so on

As you can see things get out of hand quickly even with small values of x and y, grid(3, 4) and grid(4, 3) are calculated multiple times. As stated previously, a solution that utilizes zipWith will be much more efficient.

Upvotes: 1

bheklilr
bheklilr

Reputation: 54078

Notably, there's not caching or memoization in your algorithm. GHC does not do magic and will not optimize away problems like that. For a 5x5 grid you're calling grid 139 times, for a 6x6 503, for a 7x7 it's 1847, and for 10x10 it's 97239 times. By the time you get to 20x20 you're making so many recursive calls that it's just not feasible. It's the same concept as doing

fib 0 = 1
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

You're going to have an exponential number of calls as you increase n, slowing you down. Instead, you can approach this problem similarly to how it's solved in the case of the Fibonacci sequence, using lists and memoization:

fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

Except here you want it to calculate the binomial coefficients. As for the implementation of such an algorithm, you'll have to figure that out yourself ;) I can point you at a previous answer of mine that solved the problem for generating Pascal's triangle.

Upvotes: 4

Related Questions