Node.js single thread Vs concurrency

Question

I have this ultra minimal Node.js server:

var http = require('http');
var url = require('url');

http.createServer(function (req, res) {
    var path = url.parse(req.url).pathname;

    if (path === "/") {
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.write('Hello World');
        res.end('\n');
    } else {
        if (!handler.handle(path, req, res)) {
            res.writeHead(404);
            res.write("404");
            res.end();
        }
    }
}).listen(port);

After running some benchmarks, I saw considerable performance degradation under high concurrency (> 10000 concurrent connections). I started to dig deeper into Node.js concurrency, and the more I search, the more confused I get...

I created a minimal example, outside the HTTP paradigm, to try to understand things a bit better:

function sleep(duration) {
  return new Promise(resolve => setTimeout(resolve, duration));
}

var t0 = performance.now();

async function init() {
  await Promise.all([sleep(1000), sleep(1000), sleep(1000)]);

  var t1 = performance.now();
  console.log("Execution took " + (t1 - t0) + " milliseconds.")
}

init()
// Execution took 1000.299999985145 milliseconds.

From what I understand, JavaScript operates on a single thread. That being said, I can't wrap my head around it acting like this:

             |   1 second   |
Thread #One   >>>>>>>>>>>>>>
Thread #One   >>>>>>>>>>>>>>
Thread #One   >>>>>>>>>>>>>>

... obviously this doesn't make sense.

So, is this a terminology problem around thread vs. worker, with something like this:

             |   1 second   |
Thread #One  (spawns 3 workers)
Worker #One   >>>>>>>>>>>>>>
Worker #Two   >>>>>>>>>>>>>>
Worker #Three >>>>>>>>>>>>>>

???

How is Node.js single-threaded but able to process three functions in parallel? If I am right about the parallel workers, is http spawning a worker for each incoming connection?

T.J. Crowder · Accepted Answer

A thread in a JavaScript program works by servicing a task (event) / job queue. Conceptually, it's a loop: pick up a job from the queue, run that job to completion, pick up the next job, run that job to completion.
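
To make that loop concrete, here is a toy model of it. This is only a sketch of the concept, not how Node.js is implemented; `jobQueue`, `enqueue`, and `runLoop` are hypothetical names made up for the illustration.

```javascript
// Toy model of a run-to-completion job loop (illustration only, not Node.js internals).
const jobQueue = [];
const log = [];

// Hypothetical helper: add a named job to the end of the queue.
function enqueue(name, job) {
  jobQueue.push({ name, job });
}

// Pick up a job, run it to completion, then pick up the next one.
function runLoop() {
  while (jobQueue.length > 0) {
    const { name, job } = jobQueue.shift();
    log.push(`start ${name}`);
    job(); // nothing else can run until this call returns
    log.push(`end ${name}`);
  }
}

enqueue('A', () => {
  // A job may queue more work; that work runs later, never in the middle of A.
  enqueue('C', () => {});
});
enqueue('B', () => {});

runLoop();
console.log(log.join('\n'));
```

Job C is queued while A is running, but it only starts after both A and B have finished, because each job runs to completion before the loop looks at the queue again.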

With that in mind, your example with promises works like this:

Running Node.js with your input file parses the code and queues a job to run the top-level code in the script.
The main thread picks up that job and runs your top-level code, which:
1. Creates some functions
2. Does var t0 = performance.now();
3. Calls sleep(1000), which
  - Creates a promise
  - Sets a timer to do a callback in roughly 1000ms
  - Returns the promise
4. Calls sleep(1000) twice more. Now there are three promises, and three timer callbacks scheduled for roughly the same time.
5. Awaits Promise.all on those promises. This saves the state of the async function and returns a promise (which nothing uses, because nothing is using the return value of init).
6. Now the job to run the top-level code is complete.
The thread is now idling, waiting for a job to perform (an I/O completion, timer, etc.).
After about a second of idling, a job is queued to call the first timer callback. How this happens is implementation-specific:
1. It might be that the main thread, as part of its loop checking the queue for work, also checks the list of timers to see if any of them are due to be run.
2. It might be that the timer system has its own non-JavaScript thread that does those checks (or gets a callback from an OS mechanism) and queues the job when a timer callback needs to run.
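3. In the case of I/O (since timers are a stand-in for I/O in your example), Node.js relies on libuv: network I/O completions are collected on the main thread when the event loop polls the operating system (epoll, kqueue, or I/O completion ports), while file system operations and a few other blocking calls run on libuv's thread pool, which queues a job for the JavaScript thread when the operation finishes.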
The JavaScript thread picks up the job to call the timer callback. It does that, which fulfills that first sleep promise.
Fulfilling a promise queues a "microtask" (or "promise job") to call the promise's fulfillment handler (then handler) rather than a "macrotask" (standard job). Modern JavaScript engines handle promise jobs at the end of the standard job that queues them (even if there's another standard job already in the main queue waiting to be done — promise jobs jump the main queue). So:
1. Fulfilling the first sleep promise queues a promise job.
2. At the end of the standard job for the timer callback, the thread picks the promise job and runs it.
3. The promise job triggers a call to a fulfillment handler within Promise.all's logic that stores the fulfillment value and checks to see if all of the promises it's waiting for are settled. In this case, they aren't (two are still outstanding), so there's nothing further to do and the promise job is complete.
The thread goes back to the main queue.
Almost immediately, there's a job in the queue to call the next timer callback (or possibly two jobs to call both of them).
1. The thread picks up the first one and does it, which fulfills the second sleep promise, queuing a promise job to run its fulfillment handler.
2. It picks up the queued promise job and calls the fulfillment handler in Promise.all's logic, which stores the fulfillment value and checks if all the promises are settled. They aren't, so it just returns.
3. It picks up the next standard job and does it, which fulfills the third sleep promise, queuing a promise job for its fulfillment handler.
4. It picks up that promise job, running the fulfillment handler in Promise.all's logic, which stores the fulfillment result, sees that all of the promises are settled, and queues a promise job to run its fulfillment handler.
5. It picks up that new promise job, which advances the state of the async init function:
  - The init function does var t1 = performance.now();, and shows the difference between t1 and t0, which is roughly one second.
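
The interleaving of standard jobs and promise jobs described in the steps above can be observed with an instrumented variant of your example. This is a sketch, assuming Node.js 11 or later (older versions ran promise jobs only after all due timers had fired); the `label` parameter and the `log` array are additions for the illustration, and the extra `.then` is there only to record when the promise job runs.

```javascript
const log = [];

// Like the question's sleep(), but records when each kind of job runs.
function sleep(label, duration) {
  const timer = new Promise((resolve) => {
    setTimeout(() => {
      log.push(`standard job: timer ${label} callback`);
      resolve(label); // fulfilling the promise queues a promise job
    }, duration);
  });
  // This handler runs as a promise job, right after the timer callback's job ends.
  return timer.then((value) => {
    log.push(`promise job: handler for sleep ${label}`);
    return value;
  });
}

async function init() {
  await Promise.all([sleep(1, 100), sleep(2, 100), sleep(3, 100)]);
  log.push('promise job: init resumes after Promise.all');
  return log;
}

const finished = init();
finished.then((entries) => console.log(entries.join('\n')));
```

Each timer callback should be followed immediately by its promise job, before the next timer callback runs, and init resumes only after the third pair. All of it happens on the one JavaScript thread.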

There's only one JavaScript thread involved (in that code). It's running a loop, servicing jobs in the queue. In Node.js, timers are handled by libuv's event loop on that same main thread: on each iteration, the loop checks which timers are due and runs their callbacks, so no separate timer thread is involved. If it were I/O instead of timers, it would depend on the kind of I/O: network sockets are non-blocking and are polled by the same loop, while file system operations (and some other work, such as dns.lookup) run on libuv's thread pool, which queues a job for the JavaScript thread when the operation completes. Either way, http does not spawn a worker for each incoming connection; all of your request callbacks run on the one JavaScript thread.
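
One way to convince yourself that the three sleeps overlap only because they are waiting, not because JavaScript runs in parallel, is to keep the thread busy while the timers come due. A minimal sketch; `busyWait` is a hypothetical helper written for this illustration, and the durations are arbitrary.

```javascript
const { performance } = require('perf_hooks');

function sleep(duration) {
  return new Promise((resolve) => setTimeout(resolve, duration));
}

// Hypothetical helper: occupies the JavaScript thread for `duration` ms.
function busyWait(duration) {
  const end = performance.now() + duration;
  while (performance.now() < end) {
    // spin: no other job can run while this loop is executing
  }
}

async function measure() {
  const t0 = performance.now();
  const sleeps = Promise.all([sleep(50), sleep(50), sleep(50)]);
  busyWait(200); // the timers are due after 50 ms, but their callbacks cannot run yet
  await sleeps;
  return performance.now() - t0;
}

const elapsed = measure();
elapsed.then((ms) => console.log(`Waiting took ${ms.toFixed(0)} ms`));
```

The measured time is at least 200 ms rather than roughly 50 ms: the timer callbacks are queued behind the job that is still running. The same applies to an http server, where a slow synchronous handler delays every other connection.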
