Reputation: 938
I'm concerned about efficiency in Haskell's lazy evaluation. consider following code
main = print $ x + x
where x = head [1..]
here, x
first hold the expression of head [1..]
instead of the result 1
, due to the laziness,
but then when I call x + x
, will the expression head [1..]
be executed twice?
I found the following description on haskell.org
Lazy evaluation, on the other hand, means only evaluating an expression when its results are needed (note the shift from "reduction" to "evaluation"). So when the evaluation engine sees an expression it builds a thunk data structure containing whatever values are needed to evaluate the expression, plus a pointer to the expression itself. When the result is actually needed the evaluation engine calls the expression and then replaces the thunk with the result for future reference.
So does this mean that, in x + x
, when calling the first x
, head [1..]
is executed and x
is re-assigned to 1
, and the second x
is just calling a reference of it?
Did I understand this right?
Upvotes: 3
Views: 842
Reputation: 7493
At first, x
is just a thunk. You can see that as follows:
λ Prelude> let x = head [1..]
λ Prelude> :sprint x
x = _
Here the _
indicates that x
has not yet been evaluated. Its mere definition is recorded.
Then, you can understand how x + x
is constructed by just realizing that x
is a pointer to this thunk: both those x
will point to the same thunk. Once one is evaluated, the other is, since it's the same thunk.
You can see that with ghc-vis
:
λ Prelude> :vis
λ Prelude> :view x
λ Prelude> :view x + x
should show you something along the lines of:
Here you can see that the x + x
thunk actually points twice to the x
thunk.
Now, if you evaluate x
, by printing it for example:
λ Prelude> print x
You'll obtain:
You can see here that the x
thunk is no longer a thunk: it's the value 1.
Upvotes: 9
Reputation: 71590
This is more of a question about particular Haskell implementations than about Haskell itself, since the language makes no particular guarantees about how things are evaluated.
But in GHC (and most other implementations, as far as I'm aware): yes, when thunks are evaluated they are replaced by the result internally, so other references to the same thunk benefit from the work done evaluating it the first time.
The caveat is that there are no real guarantees about which expressions end up implemented as references to the same thunk. The compiler is in general allowed to make whatever transformations to your code it likes so long as the result is the same. Of course, the reason to implement code transformations in a compiler is usually to try to make the code faster, so it's hopefully not likely to rewrite things in such a way as to make it worse, but it can never be perfect.
In practice though, you're usually pretty safe assuming that whenever you give an expression a name (as in where x = head [1..]
), then all uses of that name (within the scope of the binding) will be references to a single thunk.
Upvotes: 11
Reputation: 74244
There are two ways to evaluate an expression:
Consider the following function:
select x y z = if x > z then x else y
Now let's call it:
select (2 + 3) (3 + 4) (1 + 2)
How will this be evaluated?
Strict evaluation: Evaluate innermost first.
select (2 + 3) (3 + 4) (1 + 2)
select 5 (3 + 4) (1 + 2)
select 5 7 (1 + 2)
select 5 7 3
if 5 > 3 then 5 else 7
if True then 5 else 7
5
Strict evaluation took 6 reductions. To evaluate select
we first had to evaluate its arguments. In strict evaluation the arguments to a function are always fully evaluated. Hence functions are "call by value". Thus there's no extra bookkeeping.
Lazy evaluation: Evaluate outermost first.
select (2 + 3) (3 + 4) (1 + 2)
if (2 + 3) > (1 + 2) then (2 + 3) else (3 + 4)
if 5 > (1 + 2) then 5 else (3 + 4)
if 5 > 3 then 5 else (3 + 4)
if True then 5 else (3 + 4)
5
Lazy evaluation only took 5 reductions. We never used (3 + 4)
and hence we never evaluated it. In lazy evaluation we can evaluate a function without evaluating its arguments. The arguments are only evaluated when needed. Hence functions are "call by need".
However "call by need" evaluation strategies need extra bookkeeping - you need to keep a track of whether an expression has been evaluated. In the above expression when we evaluate x = (2 + 3)
we don't need to evaluate it again. However we do need to keep a track of whether it was evaluated.
Haskell supports both strict and lazy evaluation. However it supports lazy evaluation by default. To enable strict evaluation you would have to use the special seq
and deepSeq
functions.
Similarly you can have lazy evaluation in strict languages like JavaScript. However you would need to keep a track of whether an expression has been evaluated or not. You could research about implementing thunks in JavaScript or similar languages.
Upvotes: 3