Reputation: 5231
I came across Clojure's clojure.core.reducers/fold function. Scala also has a built-in fold function, but I could not work out whether the two behave differently or not.
Upvotes: 2
Views: 537
Reputation: 44967
I assume that you are talking about clojure.core.reducers/fold.
Scala's default fold implementation on sequences is very simple:

collection.fold(identityElem)(binOp)

It simply starts with identityElem and traverses the collection sequentially, applying the binary operation binOp to the already accumulated result and the current element. For example,

(1 to 3).fold(42000)(_ + _)

results in 42000 + 1 + 2 + 3 = 42006.
Clojure's fold with the full signature

(r/fold n combinef reducef coll)

from the above-mentioned namespace works in parallel in two stages. First, it splits the input into groups of approximately n elements each, then reduces each group using reducef, and finally combines the per-group results using combinef.
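The two-stage scheme can be sketched in Scala (sequential here for clarity; the names and the fixed partition size are illustrative, not core.reducers' actual implementation):

```scala
// A minimal sketch of r/fold's split -> reduce -> combine scheme,
// using + as both reducef and combinef.
object TwoStageFold {
  def twoStageSum(n: Int, coll: Vector[Int]): Int = {
    // stage 1: split into groups of (approximately) n, reduce each group (reducef)
    val partials = coll.grouped(n).map(_.reduce(_ + _)).toVector
    // stage 2: combine the per-group results (combinef)
    partials.reduce(_ + _)
  }

  def main(args: Array[String]): Unit =
    println(twoStageSum(2, Vector(1, 2, 3, 4, 5, 6))) // partials 3, 7, 11 -> 21
}
```

In the real r/fold the groups are reduced in parallel via fork/join, which is exactly why the combining operation has to be well behaved (more on that below).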
The main difference is that combinef is expected to be both nullary and binary at the same time (Clojure has multi-arity functions), and (combinef), called without arguments, is invoked to produce an identity element for each partition (thus, of the two pieces of documentation, one is correct and the other lies).
That is, in order to simulate Scala's fold from the above example, one would have to write something like this:
(require '[clojure.core.reducers :as r])
(r/fold 3 (fn ([] 42000) ([x y] y)) + [1 2 3])
And in general, Scala's fold

collection.fold(identityElem)(binOp)

can be emulated by reducers/fold as follows:

(r/fold collectionSize (fn ([] identityElem) ([x y] y)) binOp collection)

(note the ([x y] y) contraption that throws away the first argument; it's intentional).
I guess the interface wasn't intended to be used with nullary-and-binary operations that are not monoids; that's the reason why Scala's fold is so awkward to simulate using Clojure's fold. If you want something that behaves like Scala's fold, use reduce in Clojure.
EDIT
Oh, wait. The documentation actually states that

combinef must be associative, and, when called with no arguments, (combinef) must produce its identity element

that is, we are actually forced to use a monoid as the combinef. The above example with 42000 and ([x y] y) is therefore invalid, and its behavior is undefined. The fact that I somehow got 42006 out was a hack in the strictly technical sense: it relied on the undefined behavior of a library function to obtain the desired result.
Taking this extra information into account, I'm not sure whether Scala's fold can be simulated by Clojure's core.reducers/fold at all. Clojure's fold seems to be constrained to reductions with a monoid, whereas Scala's fold is closer to the general List catamorphism, at the expense of parallelism.
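To make the contrast concrete, a small Scala sketch: fold accepts an arbitrary (non-identity) start value but pins the result type to (a supertype of) the element type, while foldLeft is the fully general catamorphism with a free result type:

```scala
object CatamorphismDemo {
  val xs = List(1, 2, 3)

  // fold: start value need not be an identity; result type is tied to the elements.
  // On a sequential List this traverses left to right, giving 42000 + 1 + 2 + 3.
  val folded: Int = xs.fold(42000)(_ + _)

  // foldLeft: the general List catamorphism; the result type B is unconstrained.
  val asString: String = xs.foldLeft("")((acc, x) => acc + x)

  def main(args: Array[String]): Unit = {
    println(folded)   // 42006
    println(asString) // 123
  }
}
```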
Upvotes: 2
Reputation: 29984
The clojure.core.reducers namespace is a specialized implementation designed for parallel processing of large datasets. You can find the full docs here:

https://clojure.org/reference/reducers
(r/fold reducef coll)
(r/fold combinef reducef coll)
(r/fold n combinef reducef coll)
r/fold takes a reducible collection and partitions it into groups of approximately n (default 512) elements. Each group is reduced using the reducef function. The reducef function will be called with no arguments to produce an identity value in each partition. The results of those reductions are then reduced with the combinef (defaults to reducef) function. When called with no arguments, (combinef) must produce its identity element - this will be called multiple times. Operations may be performed in parallel. Results will preserve order.
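The defaults described above (combinef defaulting to reducef, and the no-argument call supplying the per-partition identity) can be modeled in Scala roughly like this (the Monoid helper and foldSketch are illustrative names, not core.reducers' API):

```scala
// Model the nullary/binary combinef as an explicit identity + associative op,
// i.e. the monoid that r/fold requires.
object RFoldDefaults {
  final case class Monoid[A](identity: A, op: (A, A) => A)

  def foldSketch[A](n: Int, m: Monoid[A], coll: Vector[A]): A =
    coll.grouped(n)
      .map(_.foldLeft(m.identity)(m.op)) // each partition starts from (combinef)'s identity
      .foldLeft(m.identity)(m.op)        // partial results combined with the same op

  def main(args: Array[String]): Unit =
    // n = 512 mirrors r/fold's default partition size
    println(foldSketch(512, Monoid[Int](0, _ + _), Vector(1, 2, 3, 4, 5))) // 15
}
```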
Until you are maxing out your machine, you should just stick to the basic reduce function:

https://clojuredocs.org/clojure.core/reduce

This is essentially the same as Scala's fold function:

(reduce + 0 [1 2 3 4 5]) => 15

where the function signature is:

(reduce <op> <init-val> <collection-to-be-reduced>)
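For comparison, the same reduction written with Scala's fold:

```scala
// Scala counterpart of (reduce + 0 [1 2 3 4 5])
object ReduceCompare {
  def main(args: Array[String]): Unit =
    println(List(1, 2, 3, 4, 5).fold(0)(_ + _)) // 15
}
```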
Upvotes: 2