ohmree
ohmree

Reputation: 375

How does GHC know how to cache one function but not the others?

I'm reading Learn You a Haskell (loving it so far) and it teaches how to implement elem in terms of foldl, using a lambda. The lambda solution seemed a bit ugly to me so I tried to think of alternative implementations (all using foldl):

import qualified Data.Set as Set
import qualified Data.List as List

-- LYAH implementation
elem1 :: (Eq a) => a -> [a] -> Bool
y `elem1` ys = 
    foldl (\acc x -> if x == y then True else acc) False ys

-- When I thought about stripping duplicates from a list
-- the first thing that came to my mind was the mathematical set
elem2 :: (Eq a) => a -> [a] -> Bool
y `elem2` ys = 
    head $ Set.toList $ Set.fromList $ filter (==True) $ map (==y) ys

-- Then I discovered `nub` which seems to be highly optimized: 
elem3 :: (Eq a) => a -> [a] -> Bool
y `elem3` ys = 
    head $ List.nub $ filter (==True) $ map (==y) ys

I loaded these functions in GHCi and did :set +s and then evaluated a small benchmark:

3 `elem1` [1..1000000] --  => (0.24 secs, 160,075,192 bytes)
3 `elem2` [1..1000000] --  => (0.51 secs, 168,078,424 bytes)
3 `elem3` [1..1000000] --  => (0.01 secs, 77,272 bytes)

I then tried to do the same on a (much) bigger list:

3 `elem3` [1..10000000000000000000000000000000000000000000000000000000000000000000000000]

elem1 and elem2 took a very long time, while elem3 was instantaneous (almost identical to the first benchmark).
I think this is because GHC knows that 3 is a member of [1..1000000], and the big number I used in the second benchmark is bigger than 1000000, hence 3 is also a member of [1..bigNumber] and GHC doesn't have to compute the expression at all.
But how is it able to automatically cache (or memoize, a term that Land of Lisp taught me) elem3 but not the two other ones?

Upvotes: 3

Views: 148

Answers (2)

nicolas
nicolas

Reputation: 9805

Explanation

foldl goes to the innermost element of the list, applies the computation, and does so again recursively to the result and the next innermost value of the list, and so on.

foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn

So in order to produce the result, it has to traverse all the list.

Conversely, in your function elem3 as everything is lazy, nothing gets computed at all, until you call head.

But in order to compute that value, you just the first value of the (filtered) list, so you just need to go as far as 3 is encountered in your big list. which is very soon, so the list is not traversed. if you asked for the 1000000th element, eleme3 would probably perform as badly as the other ones.

Lazyness

Lazyness ensure that your language is always composable : breaking a function into subfunction does not changes what is done.

What you are seeing can lead to a space leak which is really about how control flow works in a lazy language. both in strict and in lazy, your code will decide what gets evaluated, but with a subtle difference :

  • In a strict language, the builder of the function will choose, as it forces evaluation of its arguments: whoever is called is in charge.

  • In a lazy language, the consumer of the function chooses. whoever called is in charge. It may choose to only evaluate the first element (by calling head), or every other element. All that provided its own caller choose to evaluate his own computation as well. there is a whole chain of command deciding what to do.

In that reading, your foldl based elem function uses that "inversion of control" in an essential way : elem gets asked to produce a value. foldl goes deep inside the list. if the first element if y then it return the trivial computation True. if not, it forwards the requests to the computation acc. In other words, what you read as values acc, x or even True, are really placeholders for computations, which you receive and yield back. And indeed, acc may be some unbelievably complex computation (or divergent one like undefined), as long as you transfer control to the computation True, your caller will never see the existence of acc.

foldr vs foldl vs foldl' VS speed

As suggested in another answer, foldr might best your intent on how to traverse the list, and will shield you away from space leaks (whereas foldl' will prevent space leaks as well if you really want to traverse the other way, which can lead to buildup of complex computations ... and can be very useful for circular computation for instance).

But the speed issue is really an algorithmic one. There might be better data structure for set membership if and only if you know beforehand that you have a certain pattern of usage.

For instance, it might be useful to pay some upfront cost to have a Set, then have fast membership queries, but that is only useful if you know that you will have such a pattern where you have a few sets and lots of queries to those sets. Other data structure are optimal for other patterns, and it's interesting to note that from a API/specification/interface point of view, they are usually the same to the consumer. That's a general phenomena in any languages, and why many people love abstract data types/modules in programming.

Using foldr and expecting to be faster really encodes the assumption that, given your static knowledge of your future access pattern, the values you are likely to test membership of will sit at the beginning. Using foldl would be fine if you expect your values to be at the end of it.

Note that using foldl, you might construct the entire list, you do not construct the values themselves, until you need it of course, for instance to test for equality, as long as you have not found the searched element.

Upvotes: 5

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476574

Short answer: this has nothing to do with caching, but the fact that you force Haskell in the first two implementations, to iterate over all elements.

No, this is because foldl works left to right, but it will thus keep iterating over the list until the list is exhausted.

Therefore you better use foldr. Here from the moment it finds a 3 it in the list, it will cut off the search.

This is because foldris defined as:

foldr f z [x1, x2, x3] = f x1 (f x2 (f x3 z))

whereas foldl is implemented as:

foldl f z [x1, x2, x3] = f (f (f (f z) x1) x2) x3

Note that the outer f thus binds with x3, so that means foldl first so if due to laziness you do not evaluate the first operand, you still need to iterate to the end of the list.

If we implement the foldl and foldr version, we get:

y `elem1l` ys = foldl (\acc x -> if x == y then True else acc) False ys
y `elem1r` ys = foldr (\x acc -> if x == y then True else acc) False ys

We then get:

Prelude> 3 `elem1l` [1..1000000]
True
(0.25 secs, 112,067,000 bytes)
Prelude> 3 `elem1r` [1..1000000]
True
(0.03 secs, 68,128 bytes)

Stripping the duplicates from the list will not imrpove the efficiency. What here improves the efficiency is that you use map. map works left-to-right. Note furthermore that nub works lazy, so nub is here a no op, since you are only interested in the head, so Haskell does not need to perform memberchecks on the already seen elements.

The performance is almost identical:

Prelude List> 3 `elem3` [1..1000000]
True
(0.03 secs, 68,296 bytes)

In case you work with a Set however, you do not perform uniqueness lazily: you first fetch all the elements into the list, so again, you will iterate over all the elements, and not cut of the search after the first hit.

Upvotes: 6

Related Questions