Vasco
Vasco

Reputation: 415

scala parallel collections not consistent

I am getting inconsistent answers from the following code which I find odd.

import scala.math.pow

val p = 2
val a = Array(1,2,3)

println(a.par
    .aggregate("0")((x, y) => s"$y pow $p; ", (x, y) => x + y))

for (i <- 1 to 100) {
  println(a.par
    .aggregate(0.0)((x, y) => pow(y, p), (x, y) => x + y) == 14)
}

a.map(x => pow(x,p)).sum

In the code the a.par ... computes 14 or 10. Can anyone provide an explanation for why it is computing inconsistently?

Upvotes: 0

Views: 78

Answers (2)

fxlae
fxlae

Reputation: 905

In your "seqop" function, that is the first function you pass to aggregate, you define the logic that is used to combine elements within the same partition. Your function looks like this:

(x, y) => pow(y, p)

The problem is that you don't accumulate the results of a partition. Instead, you throw away your accumulator x. Every time you get 10 as a result, the calculation 2^2 was dropped.

If you change your function to take the accumulated value into account, you will get 14 every time:

(x, y) => x + pow(y, p)

Upvotes: 6

zhang rui
zhang rui

Reputation: 81

The correct way to use aggregate is

    a.par.aggregate(0.0)(
        (acc, value) => acc + pow(value, 2), (acc1, acc2) => acc1 + acc2
    )

By using (x,y) => pow(y,2) , you did not accumulate the item to the accumulator but just replaced the accumulator by pow(y,2).

Upvotes: 1

Related Questions