Maurice

Reputation: 7381

Will this make a faster parallel stream?

The OCP book says that all streams are ordered by default but that it is possible to turn an ordered stream into an unordered stream using the unordered() method.

It also says that calling this method as an intermediate operation before parallel() can greatly improve performance. My question is: will the first parallel stream below be faster than the second one?

Arrays.asList(1,2,3,4,5,6).stream().unordered().parallel()

Arrays.asList(1,2,3,4,5,6).parallelStream()

PS: I know a parallel stream doesn't increase performance when working with a small collection, but let's pretend we are working with a very large collection here.

The second stream is still ordered, right? So will the first one have better performance?

Thank you

Upvotes: 0

Views: 1005

Answers (2)

Eugene

Reputation: 120848

For the case that you have shown here, absolutely not. There are way too few elements here. Generally you should measure and then conclude, but this one is almost a no-brainer.

Also read this: Parallel Processing

The thing about unordered() is that while executing the terminal operation, the stream pipeline otherwise has to maintain encounter order, and that means additional cost. If there is no order to maintain, the stream can be faster.

Notice that once you have called unordered() there is no way to get that order back. You could sort, but the sorted order is not necessarily the initial order.

The same goes for findFirst versus findAny in a parallel pipeline: findFirst has to respect encounter order, while findAny may return whichever matching element turns up first.
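To illustrate the findFirst/findAny difference, here is a minimal sketch. The class and method names (FindFirstVsFindAny, numbers, firstEven, anyEven) are made up for this example. It only shows the semantic difference, not a timing result; for timings you would still have to measure.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FindFirstVsFindAny {

    // Hypothetical helper: builds the ordered list 1..n.
    static List<Integer> numbers(int n) {
        return IntStream.rangeClosed(1, n).boxed().collect(Collectors.toList());
    }

    // findFirst must honour encounter order, even in parallel,
    // so it always yields the first even element of the list.
    static Optional<Integer> firstEven(List<Integer> source) {
        return source.parallelStream().filter(i -> i % 2 == 0).findFirst();
    }

    // findAny is free to return any matching element,
    // so the result may differ from run to run.
    static Optional<Integer> anyEven(List<Integer> source) {
        return source.parallelStream().filter(i -> i % 2 == 0).findAny();
    }

    public static void main(String[] args) {
        List<Integer> source = numbers(1_000);
        System.out.println("findFirst: " + firstEven(source).get()); // always 2
        System.out.println("findAny:   " + anyEven(source).get());   // some even number
    }
}
```

If you do not care which match you get, findAny gives the implementation more freedom than findFirst.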

Upvotes: 1

assylias

Reputation: 328598

You state that all streams are ordered by default: that's not the case. For example if your source is a HashSet, the resulting stream will not be ordered.
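One way to check this yourself is to look at the ORDERED characteristic of the stream's spliterator. This is a small sketch, and the class name OrderedCheck and the helper isOrdered are only names chosen for the example. Keep in mind that spliterator() is a terminal operation, so the stream cannot be reused afterwards.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

public class OrderedCheck {

    // Hypothetical helper: reports whether the stream has an encounter order.
    // Calling spliterator() consumes the stream.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6);
        Set<Integer> set = new HashSet<>(list);

        System.out.println(isOrdered(list.stream()));             // true: a List has an encounter order
        System.out.println(isOrdered(set.stream()));              // false: a HashSet does not
        System.out.println(isOrdered(list.stream().unordered())); // false: the ordering constraint was dropped
    }
}
```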

Regarding your question on making a parallel stream unordered to "greatly improve performance": as always when it comes to performance, it depends (on the terminal operation, on the intermediate operations, on the size of the stream etc.)

The java.util.stream package javadoc gives some pointers that answer your question, at least in part:

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.
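As a rough sketch of the distinct() case mentioned in the quote: the two pipelines below compute the same count, and the second merely tells the implementation that it does not have to preserve encounter order. The class and method names are invented for this example. Whether the unordered variant is actually faster depends on your data and machine, so measure it (for instance with JMH) rather than assume.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class DistinctOrdering {

    // Hypothetical test data: n values drawn from 0..99, so many duplicates.
    static List<Integer> sample(int n) {
        return IntStream.range(0, n).map(i -> i % 100).boxed().collect(Collectors.toList());
    }

    // Ordered parallel pipeline: distinct() has to keep the first occurrence
    // of each value in encounter order.
    static long countDistinctOrdered(List<Integer> data) {
        return data.parallelStream().distinct().count();
    }

    // Same pipeline with the ordering constraint relaxed before distinct().
    static long countDistinctUnordered(List<Integer> data) {
        return data.parallelStream().unordered().distinct().count();
    }

    public static void main(String[] args) {
        List<Integer> data = sample(1_000_000);
        System.out.println(countDistinctOrdered(data));   // 100
        System.out.println(countDistinctUnordered(data)); // 100
    }
}
```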

Upvotes: 3
