How does argument passing order affect lazy evaluation in Haskell?

Question

I've been trying to understand lazy evaluation in Haskell and I understood it basically as only evaluate when you have to. But when trying to implement fibonacci efficiently I came across this (weird?) behaviour: This Implementation:

--wrapper function used by both implementations
fib :: Int -> Int
fib x = if x < 0 then 0 else fib2 0 1 x

fib2 :: Int -> Int -> Int -> Int
fib2 x y 0 = x
fib2 x y z = fib2 y (x + y) (z - 1)

will work even when called with

fib 20000000
> -70318061090422843

while swapping the passed arguments in the recursive call:

fib2 :: Int -> Int -> Int -> Int
fib2 x y 0 = x
fib2 x y z = fib2 (x + y) x (z - 1)

results in:

fib 20000000
>*** Exception: stack overflow

Why don't I have to tell the compiler to eagerly evaluate in the first example? Why does the second example not work while the first does?

I used the GHCi 8.0.1 on Windows 10 for this.

Reid Barton · Accepted Answer

First, notice that the difference between your two versions is quantitative, not qualitative. The first will stack overflow on an input of 40000000, and the second will complete successfully on an input of 10000000. It looks like the second version uses twice as much stack as the first.

Basically, the reason is that, if we introduce the notation {n} for the thunks that live in the x and y arguments, your first version does

fib2 {n} {n+1} 0 = {n}
fib2 {n} {n+1} z = let {n+2} = (+) {n} {n+1}    -- build a thunk
                    in fib2 {n+1} {n+2} (z - 1)

while the second version does

fib2 {n+1} {n} 0 = {n+1}
fib2 {n+1} {n} z = let {n+2} = (+) {n+1} {n}    -- build a thunk
                    in fib2 {n+2} {n+1} (z - 1)

Now consider what happens when the fib2 recursion finishes and it's time to evaluate {n} (or {n+1}; let's just ignore this difference). Each of the thunks {0}, ..., {n} will get evaluated just once, in some order. It happens that (+) evaluates its left argument first, then its right argument. For definiteness, let's just take n = 6. The evaluation looks like

{6} = (+) {4} {5}
... {4} = (+) {2} {3}
... ... {2} = (+) {0} {1}
... ... ... {0} = 0
... ... ... {1} = 1
... ... {2} = 1
... ... {3} = (+) {1} {2}
... ... ... {1} = 1
... ... ... {2} = 1   -- we already calculated it
... ... {3} = 2
... {4} = 3
... {5} = ......

The stack will never get deeper than n/2 levels, since we recurse from {n} to {n-2} first and when we need to compute {n-1} we have already computed {n-2}.

In contrast, in the second version we recurse from {n} to {n-1} first, so the stack will have n levels.

How does argument passing order affect lazy evaluation in Haskell?

Answers (1)

Related Questions