IUnknown
IUnknown

Reputation: 9809

Parallelizing reduce operation

Could a usual 'max' operation as below be distributed as a reduce operation on multiple cores by simply changing stream to parallelstream?
How does the final reconcile happen (without an explicit combiner) between the outcomes of the different threads?

List<Employee> emps = new ArrayList<>();
emps.add(new Employee("Roy1",32));
emps.add(new Employee("Roy2",12));
emps.add(new Employee("Roy3",22));
emps.add(new Employee("Roy4",42));
emps.add(new Employee("Roy5",52));

Integer maxSal= emps.parallelStream().mapToInt(e -> e.getSalary()).reduce((a,b)->Math.max(a, b)).getAsInt();

System.out.println("Maximum in parallel " + maxSal);

Upvotes: 1

Views: 178

Answers (2)

Jean-Baptiste Yun&#232;s
Jean-Baptiste Yun&#232;s

Reputation: 36401

There is no problem on parallelizing a stream provided that operations (including inner ones) have good properties. API says (for reduction):

[...] a properly constructed reduce operation is inherently parallelizable, so long as the function(s) used to process the elements are associative and stateless. [...]

And there is a definition (in the docs) of associative and stateless :

Stateless operations, such as filter and map, retain no state from previously seen element when processing a new element -- each element can be processed independently of operations on other elements.

and

An operator or function op is associative if the following holds:

 (a op b) op c == a op (b op c)  

The importance of this to parallel evaluation can be seen if we expand this to four terms:

 a op b op c op d == (a op b) op (c op d)  

So we can evaluate (a op b) in parallel with (c op d), and then invoke op on the results.

Roughly, if operation is associative and stateless that means that you can apply it in any order you want to get the result. So any good generic gathering can be applied. Remember that Java 8 streams are based on Fork-Join pool, so the gathering is well-known. Tutorial on the subject shows the basics:

if (my portion of the work is small enough)
  do the work directly
else
  split my work into two pieces
  invoke the two pieces and wait for the results

Upvotes: 1

ZhekaKozlov
ZhekaKozlov

Reputation: 39536

Yes, reduce can be parallelized. However, this requires that you pass an associative operator. Excerpt from java.util.stream JavaDoc:

Associativity

An operator or function op is associative if the following holds:

(a op b) op c == a op (b op c)

The importance of this to parallel evaluation can be seen if we expand this to four terms:

a op b op c op d == (a op b) op (c op d)

So we can evaluate (a op b) in parallel with (c op d), and then invoke op on the results. Examples of associative operations include numeric addition, min, and max, and string concatenation.

Upvotes: 1

Related Questions