Will this make a faster parallel stream?

Question

The OCP book says that all streams are ordered by default but that it is possible to turn an ordered stream into an unordered stream using the unordered() method.

It also says that this method can greatly improve performance when I use it as an intermediate operation before calling parallel(). My question is: will the first parallel stream below be faster than the second one?

Arrays.asList(1,2,3,4,5,6).stream().unordered().parallel()

Arrays.asList(1,2,3,4,5,6).parallelStream()

PS: I know a parallel stream doesn't improve performance when working with a small collection, but let's pretend we are working with a very large collection here.

The second stream is still ordered, right? So will the first one have better performance?

Thank you

assylias · Accepted Answer

You state that all streams are ordered by default: that's not the case. For example if your source is a HashSet, the resulting stream will not be ordered.
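One way to see this for yourself is to ask the stream's spliterator whether it reports the ORDERED characteristic. The following is only a minimal sketch: the class name OrderedCheck and the helper isOrdered are made up for illustration, and the printed values reflect what the standard JDK collections report.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

public class OrderedCheck {
    // Hypothetical helper: true if the stream reports an encounter order.
    // Calling spliterator() is a terminal operation, so the stream cannot be reused afterwards.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6);
        Set<Integer> set = new HashSet<>(list);

        System.out.println(isOrdered(list.stream()));                        // true: a List has an encounter order
        System.out.println(isOrdered(set.stream()));                         // false: a HashSet has none
        System.out.println(isOrdered(list.parallelStream()));                // true: parallel does not drop ordering
        System.out.println(isOrdered(list.stream().unordered().parallel())); // false: unordered() clears the flag
    }
}
```

So the second stream in the question is indeed still ordered, while the first one is not. Whether that difference matters for speed depends on the operations that follow.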

Regarding your question on making a parallel stream unordered to "greatly improve performance": as always when it comes to performance, it depends (on the terminal operation, on the intermediate operations, on the size of the stream, and so on).

The java.util.stream package javadoc gives some pointers that answer your question, at least in part:

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.
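To make the distinct() case from the quote concrete, here is a small sketch with and without unordered(). The class name, method names, and sample data are all hypothetical, and the code makes no claim about actual timings: you would have to benchmark your own pipeline (for example with JMH) to know whether unordered() helps.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UnorderedDistinct {
    // Ordered pipeline: distinct() must keep the first occurrence of each
    // value in encounter order, which constrains the parallel implementation.
    static List<Integer> orderedDistinct(List<Integer> source) {
        return source.parallelStream()
                .distinct()
                .collect(Collectors.toList());
    }

    // Unordered pipeline: we declare that we do not care about encounter order,
    // so distinct() may emit the surviving values in any order.
    static Set<Integer> unorderedDistinct(List<Integer> source) {
        return source.parallelStream()
                .unordered()
                .distinct()
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Hypothetical data: 1,000,000 values, only 1,000 of them distinct.
        List<Integer> source = IntStream.range(0, 1_000_000)
                .map(i -> i % 1_000)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> ordered = orderedDistinct(source);
        Set<Integer> unordered = unorderedDistinct(source);

        // Both pipelines find the same values; only the ordering guarantee differs.
        System.out.println(ordered.size());
        System.out.println(unordered.size());
        System.out.println(unordered.containsAll(ordered));
    }
}
```

Only drop the ordering when the result genuinely does not depend on it. For pipelines such as a plain map followed by a sum, the quote above suggests you should not expect much difference.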
