Reputation: 13872
I was reading https://dzone.com/articles/think-twice-using-java-8
Somewhere in between it states that
The problem is that all parallel streams use common fork-join thread pool, and if you submit a long-running task, you effectively block all threads in the pool.
My question is - shouldn't other threads in pool complete without waiting on long running task? OR is it talking about if we create two parallel streams parallely?
Upvotes: 1
Views: 1196
Reputation: 298133
A Stream operation does not block threads of the pool, it will utilize them. Depending on the workload split, it is possible that all threads are busy processing the Stream operation that was commenced first, so they can not pick up workload for another Stream operation. The article seems to wrongly use the word “block” for this scenario.
It’s worth noting that the Stream API and default implementation is designed for CPU bound task which do not wait for external events (block a thread). If you use it that way, it doesn’t matter which task keeps the threads busy for the overall throughput. But if you are processing different requests concurrently and want some kind of fairness in worker thread assignment, it won’t work.
If you read on in the article you see that they created an example assuming a wrong use of the Stream API, with truly blocking operations, and even call the first example broken, though they are putting it in quotes unnecessarily. In that case, the error is not using a parallel Stream but using it for blocking operations.
It’s also not correct that such a parallel Stream operation can “block all other tasks that are using parallel streams”. To have another parallel Stream operation, you must have at least one runnable thread initiating the Stream operation. Since this initiating thread will contribute to the Stream processing, there’s always at least one participating thread. So if all threads of the common pool work on one Stream operation, it may degrade the performance of other parallel Stream operations, but not bring them to halt.
E.g., if you use the following test program
long t0 = System.nanoTime();
new Thread(() -> {
Stream.generate(() -> {
long missing = TimeUnit.SECONDS.toNanos(3) + t0 - System.nanoTime();
if(missing > 0) {
System.out.println("blocking "+Thread.currentThread().getName());
LockSupport.parkNanos(missing);
}
return "result";
}).parallel().limit(100).forEach(result -> {});
System.out.println("first (blocking) operation finished");
}).start();
for(int i = 0; i< 4; i++) {
new Thread(() -> {
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
System.out.println(Thread.currentThread().getName()
+" starting another parallel Stream");
Object[] threads =
Stream.generate(() -> Thread.currentThread().getName())
.parallel().limit(100).distinct().toArray();
System.out.println("finished using "+Arrays.toString(threads));
}).start();
}
it may print something like
blocking ForkJoinPool.commonPool-worker-5
blocking ForkJoinPool.commonPool-worker-13
blocking Thread-0
blocking ForkJoinPool.commonPool-worker-7
blocking ForkJoinPool.commonPool-worker-15
blocking ForkJoinPool.commonPool-worker-11
blocking ForkJoinPool.commonPool-worker-9
blocking ForkJoinPool.commonPool-worker-3
Thread-2 starting another parallel Stream
Thread-4 starting another parallel Stream
Thread-1 starting another parallel Stream
Thread-3 starting another parallel Stream
finished using [Thread-4]
finished using [Thread-2]
finished using [Thread-3]
finished using [Thread-1]
first (blocking) operation finished
(details may vary)
There might be a clash between the thread management that created the initiating threads (those accepting external requests, for example) and the common pool, however. But, as said, parallel Stream operations are not the right tool if you want fairness between a number of independent operations.
Upvotes: 4