Kevin Meredith
Kevin Meredith

Reputation: 41939

foldLeft v. foldRight - does it matter?

Previously, Nicolas Rinaudo answered my question on Scala's List foldRight Always Using foldLeft?

Studying Haskell currently, my understanding is that foldRight should be preferred over foldLeft in cases where :: (prepend) can be used over ++ (append).

The reason, as I understand, is performance - the former occurs in O(1), i.e. add an item to the front - constant time. Whereas the latter requires O(N), i.e. go through the whole list and add an item.

In Scala, given that foldLeft is implemented in terms of foldRight, does the benefit of using :+ over ++ with foldRight even matter since foldRight gets reversed, and then foldLeft'd?

As an example, consider this simple fold.. operation for simply returning a list's elements in order.

foldLeft folds over each element, adding each item to the list via :+.

scala> List("foo", "bar").foldLeft(List[String]()) { 
                                                    (acc, elem) => acc :+ elem }
res9: List[String] = List(foo, bar)

foldRight performs a foldLeft with :: operator on each item, but then reverses.

scala> List("foo", "bar").foldRight(List[String]()) { 
                                                    (elem, acc) => elem :: acc }
res10: List[String] = List(foo, bar)

In reality, does it matter in Scala which foldLeft or foldRight to use given that foldRight uses foldRight?

Upvotes: 27

Views: 23827

Answers (6)

Thirupathi Chavati
Thirupathi Chavati

Reputation: 1871

scala> val words = List("Hic", "Est", "Index")
words: List[String] = List(Hic, Est, Index)

Incase of foldLeft: List elements will add to the empty string first and followed

words.foldLeft("")(_ + _) == (("" + "Hic") + "Est") + "Index"      //"HicEstIndex"

Incase of foldRight: Empty string will add to end of elements

words.foldRight("")(_ + _) == "Hic" + ("Est" + ("Index" + ""))     //"HicEstIndex"

Both cases will return the same output

def foldRight[B](z: B)(f: (A, B) => B): B
def foldLeft[B](z: B)(f: (B, A) => B): B

Upvotes: 3

Nicolas Rinaudo
Nicolas Rinaudo

Reputation: 6178

It does in fact matter whether you use foldLeft or foldRight in Scala, at least with lists, at least in the default implementation. I believe this answer not to hold true for libraries such as scalaz, though.

If you look at the source code of foldLeft and foldRight for LinearSeqOptimized, you'll see that:

  • foldLeft is implemented with a loop and local mutable variables, and fits in one stack frame.
  • foldRight is recursive, but not tail recursive, and thus consumes one stack frame per element in the list.

foldLeft is thus safe, while foldRight might stack overflow for long lists.

Edit To complete my answer, as it only adresses part of your question: it also matters which one you use depending on what you intend to do.

To take your example, what I consider the optimal solution is to use foldLeft, prepending results to your accumulator, and reverse the result.

This way:

  • the whole operation is O(n)
  • it will not overflow the stack, regardless of the size of the list

This is essentially what you thought you were doing with foldRight when assuming that it was implemented in terms of foldLeft.

Should you use foldRight, you'll get a slightly faster implementation (well, slightly... twice as fast, really, but still O(n)) at the cost of safety.

One could argue that, if you know your lists are going to be small enough, there is no safety issue and you can use foldRight. I feel, but that's only an opinion, that if your lists are small enough that you don't have to worry about your stack, they're small enough that you don't have to worry about the performance hit either.

Upvotes: 11

tgr
tgr

Reputation: 3608

It depends, consider the following:

scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)

scala> l.foldLeft(List.empty[Int]) { (acc, ele) => ele :: acc }
res0: List[Int] = List(3, 2, 1)

scala> l.foldRight(List.empty[Int]) { (ele, acc) => ele :: acc }
res1: List[Int] = List(1, 2, 3)

As you can see, foldLeft traverses the list from head to the last element. foldRight on the other hand traverses it from the last element to head.

If you use folding for agregation, there should be no difference:

scala> l.foldLeft(0) { (acc, ele) => ele + acc }
res2: Int = 6

scala> l.foldRight(0) { (ele, acc) => ele + acc }
res3: Int = 6

Upvotes: 6

Benjamin Kovach
Benjamin Kovach

Reputation: 3260

I'm no Scala expert, but in Haskell, one of the most important differentiating features between foldl' (which really should be the default left fold) and foldr is that foldr will work on infinite data structures, where foldl' will hang indefinitely.

In order to understand why this is, I recommend visiting foldl.com and foldr.com, expanding the evaluations a couple of times, and reconstructing the call tree. You'll quickly see where foldr is appropriate versus foldl'.

Upvotes: 2

Rein Henrichs
Rein Henrichs

Reputation: 15605

I can provide an answer for Haskell, but I doubt it will be relevant to Scala:

Let's start with the source for both,

foldl f z []     = z
foldl f z (x:xs) = foldl f (f z x) xs

foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

Now, let's look at where the recursive call to foldl or foldr appears on the right hand side. For foldl, it is outermost. For foldr, it is inside the application of f. This has a couple important implications:

  1. If f is a data constructor, that data constructor will be left-most, outermost with foldr. This means that foldr implements guarded recursion, such that the following is possible:

    > take 5 . foldr (:) [] $ [1..]
    [1,2,3,4]
    

    This means that, e.g., foldr can be both a good producer and a good consumer for short-cut fusion. (Yes, foldr (:) [] is an identity morphism for lists.)

    This is not possible with foldl because the constructor will be inside the recursive call to foldl and cannot be pattern matched.

  2. Conversely, because the recursive call to foldl is in left-most, outermost position, it will be reduced by lazy evaluation and will not take up space on the pattern-matching stack. Combined with proper strictness annotation (e.g., foldl'), this allows functions like sum or length to run in constant space.

For more on this, see Lazy Evaluation of Haskell.

Upvotes: 13

sjrd
sjrd

Reputation: 22105

@Rein Henrichs' answer is indeed irrelevant to Scala, because Scala's implementation of foldLeft and foldRight is completely different (for starters, Scala has eager evaluation).

foldLeft and foldRight themselves have actually very little to do wrt the performance of your program. Both are (liberally speaking) O(n*c_f) where c_f is the complexity of one call to the function f that is given. foldRight is slower by a constant factor because of the additional reverse, though.

So the real factor that differentiates one from the other is the complexity of the anonymous function that you give. Sometimes, it is easier to write an efficient function designed to work with foldLeft, and sometimes to foldRight. In your example, the foldRight version is best, because the anonymous function that you give to foldRight is O(1). In contrast, the anonymous function that you give to foldLeft is O(n) itself (amortized, which is what matters here), because acc keeps growing from 0 to n-1, and appending to a list of n elements is O(n).

So it actually matters whether you choose foldLeft or foldRight, but not because of these functions themselves, but because of the anonymous functions given to them. If both are equivalent, choose foldLeft by default.

Upvotes: 52

Related Questions