Reputation: 41939
Previously, Nicolas Rinaudo answered my question on Scala's List foldRight Always Using foldLeft?
Studying Haskell currently, my understanding is that foldRight
should be preferred over foldLeft
in cases where ::
(prepend) can be used over ++
(append).
The reason, as I understand, is performance - the former occurs in O(1)
, i.e. add an item to the front - constant time. Whereas the latter requires O(N)
, i.e. go through the whole list and add an item.
In Scala, given that foldLeft
is implemented in terms of foldRight
, does the benefit of using :+
over ++
with foldRight
even matter since foldRight
gets reversed, and then foldLeft'd
?
As an example, consider this simple fold..
operation for simply returning a list's elements in order.
foldLeft
folds over each element, adding each item to the list via :+
.
scala> List("foo", "bar").foldLeft(List[String]()) {
(acc, elem) => acc :+ elem }
res9: List[String] = List(foo, bar)
foldRight
performs a foldLeft with ::
operator on each item, but then reverses.
scala> List("foo", "bar").foldRight(List[String]()) {
(elem, acc) => elem :: acc }
res10: List[String] = List(foo, bar)
In reality, does it matter in Scala which foldLeft
or foldRight
to use given that foldRight
uses foldRight
?
Upvotes: 27
Views: 23827
Reputation: 1871
scala> val words = List("Hic", "Est", "Index")
words: List[String] = List(Hic, Est, Index)
Incase of foldLeft: List elements will add to the empty string first and followed
words.foldLeft("")(_ + _) == (("" + "Hic") + "Est") + "Index" //"HicEstIndex"
Incase of foldRight: Empty string will add to end of elements
words.foldRight("")(_ + _) == "Hic" + ("Est" + ("Index" + "")) //"HicEstIndex"
Both cases will return the same output
def foldRight[B](z: B)(f: (A, B) => B): B
def foldLeft[B](z: B)(f: (B, A) => B): B
Upvotes: 3
Reputation: 6178
It does in fact matter whether you use foldLeft
or foldRight
in Scala, at least with lists, at least in the default implementation. I believe this answer not to hold true for libraries such as scalaz, though.
If you look at the source code of foldLeft
and foldRight
for LinearSeqOptimized, you'll see that:
foldLeft
is implemented with a loop and local mutable variables, and fits in one stack frame.foldRight
is recursive, but not tail recursive, and thus consumes one stack frame per element in the list.foldLeft
is thus safe, while foldRight
might stack overflow for long lists.
Edit To complete my answer, as it only adresses part of your question: it also matters which one you use depending on what you intend to do.
To take your example, what I consider the optimal solution is to use foldLeft
, prepending results to your accumulator, and reverse
the result.
This way:
This is essentially what you thought you were doing with foldRight
when assuming that it was implemented in terms of foldLeft
.
Should you use foldRight
, you'll get a slightly faster implementation (well, slightly... twice as fast, really, but still O(n)) at the cost of safety.
One could argue that, if you know your lists are going to be small enough, there is no safety issue and you can use foldRight
. I feel, but that's only an opinion, that if your lists are small enough that you don't have to worry about your stack, they're small enough that you don't have to worry about the performance hit either.
Upvotes: 11
Reputation: 3608
It depends, consider the following:
scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)
scala> l.foldLeft(List.empty[Int]) { (acc, ele) => ele :: acc }
res0: List[Int] = List(3, 2, 1)
scala> l.foldRight(List.empty[Int]) { (ele, acc) => ele :: acc }
res1: List[Int] = List(1, 2, 3)
As you can see, foldLeft
traverses the list from head
to the last element. foldRight
on the other hand traverses it from the last element to head
.
If you use folding for agregation, there should be no difference:
scala> l.foldLeft(0) { (acc, ele) => ele + acc }
res2: Int = 6
scala> l.foldRight(0) { (ele, acc) => ele + acc }
res3: Int = 6
Upvotes: 6
Reputation: 3260
I'm no Scala expert, but in Haskell, one of the most important differentiating features between foldl'
(which really should be the default left fold) and foldr
is that foldr
will work on infinite data structures, where foldl'
will hang indefinitely.
In order to understand why this is, I recommend visiting foldl.com and foldr.com, expanding the evaluations a couple of times, and reconstructing the call tree. You'll quickly see where foldr
is appropriate versus foldl'
.
Upvotes: 2
Reputation: 15605
I can provide an answer for Haskell, but I doubt it will be relevant to Scala:
Let's start with the source for both,
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Now, let's look at where the recursive call to foldl or foldr appears on the right hand side. For foldl, it is outermost. For foldr, it is inside the application of f. This has a couple important implications:
If f
is a data constructor, that data constructor will be left-most, outermost with foldr. This means that foldr implements guarded recursion, such that the following is possible:
> take 5 . foldr (:) [] $ [1..]
[1,2,3,4]
This means that, e.g., foldr can be both a good producer and a good consumer for short-cut fusion. (Yes, foldr (:) []
is an identity morphism for lists.)
This is not possible with foldl because the constructor will be inside the recursive call to foldl and cannot be pattern matched.
Conversely, because the recursive call to foldl is in left-most, outermost position, it will be reduced by lazy evaluation and will not take up space on the pattern-matching stack. Combined with proper strictness annotation (e.g., foldl'
), this allows functions like sum
or length
to run in constant space.
For more on this, see Lazy Evaluation of Haskell.
Upvotes: 13
Reputation: 22105
@Rein Henrichs' answer is indeed irrelevant to Scala, because Scala's implementation of foldLeft
and foldRight
is completely different (for starters, Scala has eager evaluation).
foldLeft
and foldRight
themselves have actually very little to do wrt the performance of your program. Both are (liberally speaking) O(n*c_f) where c_f is the complexity of one call to the function f
that is given. foldRight
is slower by a constant factor because of the additional reverse
, though.
So the real factor that differentiates one from the other is the complexity of the anonymous function that you give. Sometimes, it is easier to write an efficient function designed to work with foldLeft
, and sometimes to foldRight
. In your example, the foldRight
version is best, because the anonymous function that you give to foldRight
is O(1). In contrast, the anonymous function that you give to foldLeft
is O(n) itself (amortized, which is what matters here), because acc
keeps growing from 0 to n-1, and appending to a list of n elements is O(n).
So it actually matters whether you choose foldLeft
or foldRight
, but not because of these functions themselves, but because of the anonymous functions given to them. If both are equivalent, choose foldLeft
by default.
Upvotes: 52