Reputation: 7592

Is using fold less efficient than standard recursion

I'm going through the Learn You a Haskell book right now and I'm curious about how this particular example works. The book first demonstrates an implementation of findKey using traditional recursion:

findKey :: (Eq k) => k -> [(k,v)] -> Maybe v  
findKey key [] = Nothing  
findKey key ((k,v):xs) = if key == k  
                            then Just v  
                            else findKey key xs

The book then follows up with a shorter implementation using foldr

findKey :: (Eq k) => k -> [(k,v)] -> Maybe v  
findKey key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing

With the standard recursion, the function should immediately return once it hits the first element with the provided key. If I understand the foldr implementation correctly, it will iterate over the entire list every time, even if it matched the first element it came across. That doesn't seem like a very efficient way to handle the problem.

Is there something I'm not getting about how the foldr implementation works? Or is there some kind of magic within Haskell that makes this implementation not quite as inefficient as I think it is?

Upvotes: 11

Answers (5)

Will Ness

Reputation: 71065

foldr f z [a,b,c,...,n]                   == 
a `f` (b `f` (c `f` (... (n `f` z) ...))) == 
f a (foldr f z [b,c,...,n])               == 
f a acc   where   acc = foldr f z [b,c,...,n]

So if your f returns before forcing acc, acc remains not forced, i.e. no part of the list argument beyond its head element a is accessed, like e.g. when you have

f a acc = ...

If, on the other hand, your f does force its second argument, e.g. if it's defined as

f a (x:xs) = ...

then the acc is forced before f starts its work, and the list will be accessed in whole before the processing begins -- in whole, because acc = f b acc2 and that invocation of f must force its second argument, acc2, so its value, acc, can be forced (pattern-matched with (x:xs), that is); and so forth.

Upvotes: 2

gen

Reputation: 10040

Think of foldr using the following definition (using standard recursion):

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f e [] = e
foldr f e (x:xs) = f x (foldr f e xs)

The third line shows that the second implementation for findKey will return upon finding the first match.

As a sidenote: assume you had the following definition (which does not have identical functionality) for findKey (as an exercise you might want to rewrite the definition using foldr):

findKey :: (Eq k) => k -> [(k,v)] -> [v]
findKey key [] = []
findKey key ((kHead, vHead):rest) = if (key == kHead) then vHead:(findKey key rest) else findKey key rest

Now you might think that this would iterate through the whole input list. Depending on how you invoke this function, it could be the case that it iterates through the whole list, but at the same time this can give you the first match efficiently too. Due to Haskell's lazy evaluation the following code:

head (findKey key li)

will give you the first match (assuming that there's one) with the same efficiency as your first example.

Upvotes: 2

melpomene

Reputation: 85767

foldr is written using standard recursion.
The recursive call to foldr is hidden inside of acc. If your code doesn't use acc, it will never be computed (because Haskell is lazy). So the foldr version is efficient and will also return early.

Here's an example demonstrating this:

Prelude> foldr (\x z -> "done") "acc" [0 ..]
"done"

This expression returns "done" immediately, even though the input list is infinitely long.

If foldr is defined as:

foldr f z (x : xs) = f x (foldr f z xs)
foldr _ z [] = z

, then evaluation goes via

f x (foldr f z xs)
    where
    f = \x z -> "done"
    x = 0
    z = "acc"
    xs = ...  -- unevaluated, but is [1 ..]

which is

(\x z -> "done") 0 (foldr (\x z -> "done") "acc" [1 ..])

which turns into "done" because the first function doesn't use z, so the recursive call is never needed.

Upvotes: 13

ErikR

Reputation: 52039

Here is code which demonstrates that foldr does indeed "short-circuit" the evaluation of findKey:

import Debug.Trace

findKey :: (Eq k) => k -> [(k,v)] -> Maybe v  
findKey key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing  

tr x = trace msg x
  where msg = "=== at: " ++ show x

thelist = [ tr (1,'a'), tr (2,'b'), tr (3, 'c'), tr (4, 'd') ]

An example of running findKey in ghci:

*Main> findKey 2 thelist
=== at: (1,'a')
=== at: (2,'b')
Just 'b'
*Main>

Upvotes: 3

chi

Reputation: 116139

If I understand the foldr implementation correctly, it will iterate over the entire list every time, even if it matched the first element it came across.

This is wrong. foldr will evaluate the list only as much as needed.

E.g.

foldr (&&) True [True, False, error "unreached code here"]

returns False since the error is never evaluated, precisely as in

(True && (False && (error "unreached code here" && True)))

Indeed, since the end of the list is never reached, we can also write

foldr (&&) (error "end") [True, False, error "unreached code here"]

and still obtain False.

Upvotes: 3

Is using fold less efficient than standard recursion

Answers (5)

Related Questions