Reputation: 599
I am trying to understand the core principles of non-blocking programming (and frameworks like Project Reactor). The main idea is to have a thread pool with a fixed number of threads (executors) and tasks that are executed on them. We should not have any blocked threads. In "user code" we just submit something for execution and leave a callback (what to do with the result). Our "user" thread is not blocked, right? But what if my task depends on some JDBC query? My task will issue this query and then block waiting for the result, right? So this thread is blocked.
But we avoid creating threads (which is expensive). Is that the core benefit of this style?
If my thread pool consists of 2 executors and both are blocked waiting for something, other tasks will not be executed, right? How to avoid it? Create more than 2 threads?
Upvotes: 4
Views: 3156
Reputation: 206786
Threads are relatively costly system resources. For example, each thread needs memory for its call stack. How much this is depends on the operating system and settings, but typically it's something like 1 or 2 MB. This means it's not a good idea to start thousands of threads: you'd set aside 1 or 2 GB of memory just for the call stacks of 1000 threads.
So, to do things more efficiently you want to limit the number of threads, for example using a thread pool to handle work. The thread pool makes it possible to manage the number of threads that are being used.
However, imagine that you'd have a thread pool with 10 threads, and then 10 requests come in. Each of your threads will be reserved to handle a request. While they are busy, you can't handle request #11 because there is no thread free. When you are using blocking I/O, then, even though all your 10 threads are doing nothing (waiting for I/O to complete), request #11 cannot be handled...
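To make the starvation concrete, here is a minimal sketch using only the JDK. The names `BlockedPoolDemo`, `blockingQuery` and `runBlocking` are made up for illustration, and `Thread.sleep` merely stands in for a blocking call such as a JDBC query. With 2 pool threads and 3 requests, the third request cannot start until one of the first two has finished waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockedPoolDemo {

    // Hypothetical stand-in for a blocking call (e.g. a JDBC query): it only sleeps.
    static void blockingQuery(long millis) throws InterruptedException {
        Thread.sleep(millis);
    }

    // Runs `requests` blocking tasks on a pool of `poolSize` threads and returns
    // the wall-clock time in milliseconds until all of them have finished.
    static long runBlocking(int poolSize, int requests, long ioMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    blockingQuery(ioMillis); // the pool thread is stuck here, doing nothing
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every request to complete
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2 threads, 2 requests: " + runBlocking(2, 2, 200) + " ms");
        System.out.println("2 threads, 3 requests: " + runBlocking(2, 3, 200) + " ms");
    }
}
```

On a typical machine the second line should report roughly twice the time of the first, because request #3 sits in the queue while both threads sleep.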
When you use non-blocking I/O, threads never need to wait for I/O. When the handling of request #3 is suspended because it needs the result of an I/O operation, the thread that was handling it can temporarily switch to handling other requests.
So, with non-blocking I/O, you never have waiting threads and you are using system resources more efficiently.
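As a rough sketch of that idea with plain JDK classes (not Reactor itself): below, a single worker thread handles all requests. The method `asyncQuery` is a hypothetical stand-in for a non-blocking driver; it returns a `CompletableFuture` immediately and completes it later from a timer thread, which plays the role of the I/O event source. All class and method names here are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NonBlockingDemo {

    // Hypothetical non-blocking query: returns at once, and the future is completed
    // later by the timer thread (simulating "the I/O result has arrived").
    static CompletableFuture<String> asyncQuery(ScheduledExecutorService timer, int id, long ioMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> result.complete("row-" + id), ioMillis, TimeUnit.MILLISECONDS);
        return result;
    }

    // Handles `requests` requests on ONE worker thread and returns the elapsed milliseconds.
    static long runNonBlocking(int requests, long ioMillis) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> responses = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                responses.add(CompletableFuture
                        .supplyAsync(() -> id, worker)                    // start handling the request
                        .thenCompose(n -> asyncQuery(timer, n, ioMillis)) // issue I/O and return at once
                        .thenApplyAsync(row -> "response for " + row, worker)); // callback on the worker
            }
            CompletableFuture.allOf(responses.toArray(new CompletableFuture[0])).get();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            worker.shutdown();
            timer.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread, 10 requests: " + runNonBlocking(10, 200) + " ms");
    }
}
```

Because the worker never sits inside the simulated I/O, the ten requests overlap and the total should stay close to the duration of a single query rather than ten times that.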
This will only work if you are using non-blocking I/O from the front to the back of your system. If at the back-end you are using JDBC, which is a blocking API, then you'll lose the full benefit of non-blocking I/O.
Therefore, if you have a database at the back-end, this works best if you have a DB which supports non-blocking I/O. Some NoSQL databases like MongoDB support this, and for some relational databases there are special drivers / APIs available that support this. You won't be using JDBC in that case, because JDBC is an inherently blocking API.
Oracle is working on a new API for relational databases tentatively called ADBA which will allow you to do non-blocking / async I/O with relational databases but it's not ready yet.
Upvotes: 16
Reputation: 31
Project Reactor is an implementation of the Reactive Streams specification. An overview of the specification can be found at the Reactive Manifesto. It's not just about creating a set of threads and letting them do their jobs; it's the framework or runtime (in this case Project Reactor) that organizes your code in such a way that it behaves in a non-blocking manner. Also, the whole system has to be implemented in this fashion, otherwise you won't benefit from reactive streams.
If my thread pool consists of 2 executors and both are blocked waiting for something, other tasks will not be executed, right? How to avoid it? Create more than 2 threads?
The answer to this is yes and no. The framework may or may not create threads. Because the code is interleaved among the threads, and because non-blocking systems are event-driven all the way down to the low-level operations (e.g. libuv I/O), a thread does not need to wait for an I/O operation to complete. In the meantime, the thread executes something else that is meaningful. When the operation completes, a notification is raised and the dependent code can be executed by any of the available threads. The goal of such a system is to utilize the CPU to the fullest with limited resources (threads).
Taken from http://www.reactive-streams.org. The main goal of Reactive Streams is to govern the exchange of stream data across an asynchronous boundary—think passing elements on to another thread or thread-pool—while ensuring that the receiving side is not forced to buffer arbitrary amounts of data. In other words, back pressure is an integral part of this model in order to allow the queues which mediate between threads to be bounded. The benefits of asynchronous processing would be negated if the communication of back pressure were synchronous (see also the Reactive Manifesto), therefore care has to be taken to mandate fully non-blocking and asynchronous behavior of all aspects of a Reactive Streams implementation.
It's the Reactor framework that enforces this and helps you build a completely non-blocking system from the ground up.
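The back-pressure idea from the quoted passage can be sketched with the `java.util.concurrent.Flow` interfaces (JDK 9+), which mirror the Reactive Streams interfaces. This is not Reactor code, only an illustration: `BackPressureDemo`, `OneAtATimeSubscriber` and `publish` are names invented for this example. The subscriber signals demand with `request(1)`, so the publisher never pushes more than the subscriber has asked for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackPressureDemo {

    // Subscriber that requests exactly one element at a time.
    static class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = Collections.synchronizedList(new ArrayList<>());
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // initial demand: one element
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            subscription.request(1); // ask for the next element only after handling this one
        }

        @Override
        public void onError(Throwable throwable) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes 1..count and returns what the subscriber received.
    static List<Integer> publish(int count) throws InterruptedException {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        // close() at the end of the try block signals onComplete after pending items are delivered
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i); // buffered until the subscriber requests it
            }
        }
        subscriber.done.await();
        return subscriber.received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publish(5));
    }
}
```

Note that `SubmissionPublisher.submit` itself blocks the producer when its bounded buffer is full; that is a policy of this particular JDK class, while libraries such as Reactor offer other strategies for the same situation.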
Upvotes: 2