Luftbaum

Reputation: 323

How to use parallel streaming safely in java?

I am using parallel streams for a large calculation, but the results were wrong when I ran it. When I run everything sequentially, it works fine.

To find the problem, I reduced it to a simple calculation (the job is to add up all natural numbers from 1 to 'iterations'):

private static long sum = 0;           // global helper (static so main can access it)

public static void main(String args[])
{
    int iterations = 50000;            // the target number

    // solving the task parallel
    // ---------------------------------- <Section-under-Test>.START
    final long    startPAR =     System.nanoTime();
    IntStream.range(1, iterations+1).parallel().forEach(i->{

        sum += i;
    });
    final long durationPAR = ( ( System.nanoTime() - startPAR ) / 1_000_000 );
    // ---------------------------------- <Section-under-Test>.FINISH
    long sumParallel = sum; sum = 0;   // save + reset variable

    System.out.println( "Sum parallel: "  + sumParallel
                      + "         TOOK: " + durationPAR
                        );

    // solving the task linear using one core
    // ---------------------------------- <Section-under-Test>.START
    final long    startSEQ =     System.nanoTime();

    for(int i = 1; i < iterations+1; i++)
    {
        sum += i;
    }
    final long durationSEQ = ( ( System.nanoTime() - startSEQ ) / 1_000_000 );
    // ---------------------------------- <Section-under-Test>.FINISH

    System.out.println( "Sum serial:   "  + sum
                      + "         TOOK: " + durationSEQ
                        );
    System.out.println((sum == sumParallel));
}

Strangely, I get a different output each time I execute the parallel part:

Sum parallel: 354519954
Sum serial:  1250025000
false
---------------------------
Sum parallel: 453345292
Sum serial:  1250025000
false
---------------------------
Sum parallel: 613823840
Sum serial:  1250025000
false

So what I want to know is:

Did I miss the point and try to use parallel computing in the wrong place?

Info:

In my bigger calculation, I compute in parallel the value that is then added to, in this case, the sum. That part works just fine. But how do I properly add this result to a global variable?

Upvotes: 1

Views: 66

Answers (2)

nickb

Reputation: 59709

The statement sum += i; is not atomic. It involves multiple operations:

  1. Read i
  2. Read sum
  3. Compute sum plus i
  4. Assign result to sum

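At any point during the execution of those operations, other threads may be performing the same operations. If two threads read the same value of sum (operation 2) before either of them has written its result back, one of the additions is lost: each thread calculates sum + i from the stale value, and whichever thread assigns last overwrites the other's result. Because the thread scheduling differs from run to run, the number of lost additions differs too, which is why the parallel sum changes on every execution.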
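The interleaving described above can be replayed deterministically on a single thread. This is only an illustrative sketch: the class name LostUpdateDemo and the method replayInterleaving are made up for this example, and the "threads" A and B are simulated by ordering the statements by hand.

```java
class LostUpdateDemo {
    static long sum = 0;

    // Replays, on one thread, an interleaving in which two threads
    // both read sum before either of them writes its result back.
    static long replayInterleaving(int iA, int iB) {
        sum = 0;
        long seenByA = sum;    // thread A: read sum
        long seenByB = sum;    // thread B: read sum (same value A saw)
        sum = seenByA + iA;    // thread A: compute and assign
        sum = seenByB + iB;    // thread B: compute and assign, overwriting A's result
        return sum;
    }

    public static void main(String[] args) {
        // The correct total would be 3; A's addition of 1 is lost.
        System.out.println(replayInterleaving(1, 2)); // prints 2
    }
}
```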

If you want to do such a calculation in parallel with a shared variable, use AtomicLong instead. Operations such as addAndGet are guaranteed to be atomic, that is, each call takes effect as a single step and is not broken up into the steps above.
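A minimal sketch of the AtomicLong approach, applied to the summation from the question. The class name AtomicSumDemo and the method parallelSum are placeholders chosen for this example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

class AtomicSumDemo {
    // Sums 1..iterations in parallel using a shared atomic accumulator.
    static long parallelSum(int iterations) {
        AtomicLong sum = new AtomicLong();
        IntStream.rangeClosed(1, iterations)
                 .parallel()
                 .forEach(i -> sum.addAndGet(i)); // atomic read-modify-write
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println("Sum parallel: " + parallelSum(50000)); // prints Sum parallel: 1250025000
    }
}
```

This gives the correct result, but every thread still contends for the same variable, so it is unlikely to be faster than the sequential loop for work this cheap.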

Upvotes: 2

diginoise

Reputation: 7630

Just use the .sum() method, so that no thread operates on a shared state, hence no need for synchronized or atomic access:

int sum = IntStream.rangeClosed(1, iterations).parallel().sum();

If you need more control over the calculation, you can use a .reduce() method:

int sum = IntStream.rangeClosed(1, iterations).parallel().reduce(0, (a, b) -> a + b);
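For the bigger calculation mentioned in the question, where each element requires its own computation before being added, the same idea can be sketched with mapToLong followed by sum(). The method compute below is a hypothetical stand-in for the real per-element work, and ReductionDemo is a made-up class name. Using a long result also avoids int overflow for larger inputs.

```java
import java.util.stream.IntStream;

class ReductionDemo {
    // Hypothetical stand-in for the expensive per-element calculation.
    static long compute(int i) {
        return (long) i * i;
    }

    // Each element is computed independently and the stream combines
    // the partial results, so no shared variable is written by the threads.
    static long parallelTotal(int iterations) {
        return IntStream.rangeClosed(1, iterations)
                        .parallel()
                        .mapToLong(ReductionDemo::compute)
                        .sum();
    }

    public static void main(String[] args) {
        System.out.println(parallelTotal(10)); // prints 385
    }
}
```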

Upvotes: 2
