Reputation: 3176
Does the placement of calls to sequential() and parallel() change how a Java 8 stream's pipeline is executed?
For example, suppose I have this code:
new ArrayList().stream().parallel().filter(...).count();
In this example, it's pretty clear that filter() will run in parallel. However, what if I have this code:
new ArrayList().stream().filter(...).parallel().count();
Does filter() still run in parallel, or does it run sequentially? It's unclear because intermediate operations like filter() are lazy, i.e., they don't run until a terminal operation such as count() is invoked. By the time count() is invoked, we have a parallel stream pipeline, but is filter() performed sequentially because it came before the call to parallel()?
Upvotes: 11
Views: 1011
Reputation: 298579
Note the end of the Stream class documentation:
Stream pipelines may execute either sequentially or in parallel. This execution mode is a property of the stream. Streams are created with an initial choice of sequential or parallel execution. (For example, Collection.stream() creates a sequential stream, and Collection.parallelStream() creates a parallel one.) This choice of execution mode may be modified by the BaseStream.sequential() or BaseStream.parallel() methods, and may be queried with the BaseStream.isParallel() method.
In other words, calling sequential() or parallel() only changes a property of the stream, and the state of that property at the point when the terminal operation commences determines the execution mode of the entire pipeline.
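A minimal sketch of that rule, assuming a standard JDK: the class name ExecutionModeDemo and its helper methods are made up for illustration and are not part of any library. It checks isParallel() after the mode-changing call and records which threads run filter() when sequential() is the last call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ExecutionModeDemo {

    // Hypothetical helper: filter() comes BEFORE parallel(), yet the pipeline reports parallel mode.
    static boolean parallelCalledAfterFilter(List<Integer> data) {
        return data.stream().filter(x -> x % 2 == 0).parallel().isParallel();
    }

    // Hypothetical helper: parallel() comes first, but the later sequential() call wins.
    static boolean sequentialCalledLast(List<Integer> data) {
        return data.stream().parallel().filter(x -> x % 2 == 0).sequential().isParallel();
    }

    // Hypothetical helper: records the names of the threads that execute filter()
    // when sequential() is the last mode-changing call before the terminal operation.
    static Set<String> filterThreadsWhenSequentialLast(List<Integer> data) {
        Set<String> threads = ConcurrentHashMap.newKeySet();
        data.stream()
            .parallel()
            .filter(x -> { threads.add(Thread.currentThread().getName()); return true; })
            .sequential()
            .count(); // terminal operation; the mode in effect here applies to the whole pipeline
        return threads;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println("filter().parallel() is parallel: " + parallelCalledAfterFilter(data));
        System.out.println("parallel().filter().sequential() is parallel: " + sequentialCalledLast(data));
        System.out.println("threads that ran filter(): " + filterThreadsWhenSequentialLast(data));
    }
}
```

With sequential() called last, filter() should run only on the calling thread, even though parallel() appears earlier in the chain.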
This may not be documented clearly everywhere, because it wasn’t always so. Early in development, there were prototypes in which different stages could have different modes. This mail from March 2013 explains the change.
Upvotes: 10
Reputation: 34648
It appears that, at least in the standard Oracle Java 8 implementation, although the parallel() method is defined as an "intermediate operation", it is not exactly lazy. That is, it has an immediate effect, regardless of whether you have a terminal operation or not. Consider the following example:
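import java.util.stream.Stream;

public class SimpleTest {
    public static void main(String[] args) {
        Stream<Integer> s = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(s.isParallel());   // mode before parallel() is called
        Stream<Integer> s1 = s.parallel();    // no terminal operation is invoked
        System.out.println(s.isParallel());   // mode of the original reference afterwards
        System.out.println(s == s1);          // whether parallel() returned the same object
    }
}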
The output on my machine is:
false
true
true
This tells us that parallel() immediately changes the state of the underlying stream (and returns that same stream).
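The same observation can be extended to a pipeline with more than one stage. The sketch below assumes the OpenJDK reference implementation; the class name SharedModeFlagDemo and the method observeFlag are made up for illustration. It calls parallel() on a downstream filter() stage and then queries the source stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class SharedModeFlagDemo {

    // Hypothetical helper: returns { source mode before, source mode after, filtered mode after }.
    static boolean[] observeFlag(List<Integer> data) {
        Stream<Integer> source = data.stream();
        Stream<Integer> filtered = source.filter(x -> x > 2); // intermediate stage; nothing runs yet
        boolean before = source.isParallel();
        filtered.parallel(); // called on the downstream stage only, with no terminal operation
        return new boolean[] { before, source.isParallel(), filtered.isParallel() };
    }

    public static void main(String[] args) {
        boolean[] flags = observeFlag(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println("source before parallel():  " + flags[0]);
        System.out.println("source after parallel():   " + flags[1]);
        System.out.println("filtered after parallel(): " + flags[2]);
    }
}
```

On OpenJDK, the source stream reports the new mode as well, which suggests the mode is tracked once for the whole pipeline rather than per stage. That is an implementation detail rather than something the Javadoc spells out.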
However, the Javadoc is written in a way that allows this but does not require it. This means that other stream implementations are free to execute the operations before the parallel() call in a different execution mode than those after it.
In short, it's not a behavior you can rely on, either way.
Upvotes: 4