Reputation: 117
I've got no problem with events and callbacks, synchrony/asynchrony, the call stack and the queue.
However, as I understand it, other servers make a new thread for each connection, which contains both the blocking request and the handler for that request's response, whereas in Node this handler would be passed to the main thread as a callback. The ability of this kind of server to handle multiple requests is therefore limited by its ability to create and switch between multiple threads.
When Node receives a blocking request it sends it into asynchrony land while it carries on processing the main thread. What happens in asynchrony land? Doesn't a thread still need to be created to await the response for that request and then send an event to Node's event loop? If so, why isn't Node just as limited by the server's ability to create and switch between threads? If not, what happens to the request?
Upvotes: 1
Views: 582
Reputation: 4623
I think there's some confusion over how the event loop actually works. NodeJS doesn't "receive a blocking request" and "send it into asynchrony land". It's asynchronous to begin with - unless you call a ...Sync() pattern function, EVERY call and EVERY operation is async. Confusingly, once you are inside your CODE, EVERY operation is synchronous.
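For example, a minimal sketch using the built-in fs module (the file path is just a placeholder; substitute any readable file):

```js
const fs = require('fs');

// Async form: readFile() hands the work off and returns immediately;
// the callback runs later, when the event loop gets to it.
fs.readFile('/etc/hostname', 'utf8', (err, data) => {
  if (err) throw err;
  console.log('async read finished:', data.trim());
});

// Sync form: the ...Sync() variant blocks the entire process until done.
const hostname = fs.readFileSync('/etc/hostname', 'utf8');
console.log('sync read finished:', hostname.trim());

console.log('this line prints before the async callback above fires');
```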
It's a "cooperative multitasking" approach - all calls to the system are expected to "start the ball rolling" and return immediately, while your own code is suppose to do what it needs to do as quickly as possible and yield control back to the JSVM (by returning from your function).
To understand how this works when you're dealing with network communications, you need to go back in time to before threads really even existed. In the early days, if you had multiple network connections, your single-threaded process would have to put together a list of all the sockets it wanted information on (such as "has data arrived for me to read?"), and ask the OS if that was true by calling select(). This would return a yes/no answer for each socket for each question. It was typically done in a while() loop that ran until the program was terminated. You would ask for a list of sockets with new data, read that data, do something with it, and then go back to sleep, over and over again.
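The shape of that old loop looked roughly like this (a simulated sketch only - the "sockets" here are fake in-memory objects standing in for what select() would report, since the real call is a C-level OS API):

```js
// Fake "sockets" with queued incoming data, standing in for real OS sockets.
const sockets = [
  { id: 1, pending: ['hello'] },
  { id: 2, pending: [] },
];

// Stand-in for asking select(): "has data arrived for me to read?"
const hasData = (socket) => socket.pending.length > 0;

const loop = setInterval(() => {
  for (const socket of sockets) {
    if (hasData(socket)) {
      const data = socket.pending.shift();               // read the data
      console.log(`socket ${socket.id}: got "${data}"`); // do something with it
    }
  }
}, 100); // then go back to sleep, over and over again

setTimeout(() => clearInterval(loop), 500); // stop the demo so it exits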
NodeJS is far more sophisticated but this analogy works well for it. It has a main "event loop" that is constantly sleeping until there is work to do, then waking up and doing it.
Everything that you do comes from, or goes into, this channel. If you write data to a network socket, and ask to be notified (called back) when it's done, NodeJS passes your request to the operating system and then goes to sleep. You stop running. Your context is saved - all your local vars are saved. When the OS comes back and says "done!", NodeJS checks its list and sees you wanted to know about this, and calls your function, reloading your context so all your local vars are where you need them.
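As an illustration (a sketch only; the host, port, and requestId are made-up values), writing to a socket and asking to be called back might look like this - note that no thread of yours waits around, and the closure keeps your local variables alive until the callback runs:

```js
const net = require('net');

const socket = net.connect(80, 'example.com', () => {
  const requestId = 42; // a local variable: part of "your context"

  // write() passes the bytes to the OS and returns immediately.
  // The callback runs later, once Node learns the data has been flushed.
  socket.write('GET / HTTP/1.0\r\nHost: example.com\r\n\r\n', () => {
    // The closure preserved requestId - no thread sat blocked waiting.
    console.log(`request ${requestId}: write handed off to the kernel`);
    socket.end();
  });
});
```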
To be very clear, it is entirely possible that when the data is finished being written to the network, and the OS notification comes back for that, NodeJS is busy with other work! NodeJS won't "create a thread" to handle it - it'll ignore it completely until it gets some free time! It won't be lost... it just won't be handled "yet".
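You can see this "not yet" behaviour directly with a small, self-contained demo: a timer is due after 10 ms, but because the single thread is busy with synchronous work, its callback only runs once that work returns.

```js
const due = Date.now() + 10;
setTimeout(() => {
  console.log(`callback ran ~${Date.now() - due} ms late`);
}, 10);

// Busy-work that hogs the one thread for roughly 200 ms.
const end = Date.now() + 200;
while (Date.now() < end) { /* spin */ }
console.log('synchronous work finished; now Node can service the timer');
```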
This drives programmers used to threading models nuts - it seems illogical that this constant state of never immediately responding to an incoming event "until it has a chance" could possibly be efficient. But software architectures are often deceiving. Threading models actually have fairly high overhead. CPU core counts aren't infinite - the entire computer as a whole is doing plenty of work all the time. Threads aren't free - just because you make one doesn't mean the CPU itself has time to do anything with it. And the overhead of thread creation and management often means an efficiency loss.
Old-school event-loop models eliminate this overhead. When things go badly - for example, when you have an infinite loop in your code - they can behave very badly, often locking up completely. But when things are going well they can actually be a lot faster, and many benchmarks have shown that well-written NodeJS modules can perform as well as or even better than similar modules in other languages.
In summary, the most common confusion in NodeJS is what "async" really means. A good way to think of it is that in threading models, programmers are expected to be "bad"/simplistic (write blocking code and just wait for things to return) and the core VM or OS is expected to be "good"/smart (tolerate this by making threads to handle async work). In NodeJS, programmers are expected to be "good"/sophisticated (write well-structured async code), allowing the JSVM to focus on what it does best and not need as much magic to make things work well. Well-used, NodeJS puts a lot of power in your hands.
Upvotes: 2