Reputation: 7306

Node.js/Express and parallel queues

We are building an infrastructure which features a Node.js server and Express.

The server currently works as follows:

The server accepts an incoming HTTP request from the client.
The server generates two files (this operation can be "relatively long", on the order of 0.1 seconds or so).
The server uploads the generated files (~20-200 KB each) to an external CDN.
The server responds to the client, and the response includes the URI of the files on the CDN.

Currently the server does this sequentially within each request, and this works quite well (Node/Express can handle concurrent requests automatically). However, as we grow, the number of concurrent requests may increase, and we believe it would be better to implement a queue for processing them. Otherwise, we risk having too many tasks running at the same time and too many open connections to the CDN. Responding to the client quickly is not a priority.

What I was thinking about is to have a separate part in the Node server that contains a few "workers" (2-3, but we will do tests to determine the correct number of simultaneous operations). So, the new flow would look something like:

After accepting the request from the client, the server adds an operation to a queue.
There are 2-3 (to be tested) workers that take elements out of the queue and perform all the operations (generate the files and upload them to the CDN).
When a worker has processed the operation (it doesn't matter if it stays in the queue for a relatively long time), it notifies the Node server (via a callback), and the server responds to the client (which has been waiting in the meantime).

What do you think of this approach? Do you believe it is the correct one?

Most importantly, HOW could this be implemented in Node/Express?

Thank you for your time

Upvotes: 18

Answers (4)

Tarun Reddy

Reputation: 121

You can use the Kue module, with Redis as the database that backs the queue and holds the jobs. You create jobs and place them in a queue using Kue, and you can have as many workers as you like processing them. Useful link: Kue - https://github.com/Automattic/kue

Upvotes: 0

ItalyPaleAle

Reputation: 7306

(Answering my own question)

According to this question on Stack Overflow, a solution in my case would be to implement a queue using Caolan McMahon's async module.

The main application will create jobs and push them into a queue, which has a limit on the number of jobs that can run concurrently. This allows tasks to be processed concurrently while strictly capping how many run at once. It works like Cocoa's NSOperationQueue on Mac OS X.
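To make the idea concrete, here is a minimal, dependency-free sketch of such a queue and of a request handler that waits for its job. It is not the async module's implementation: `createQueue`, `generateAndUpload`, `handleRequest`, the 100 ms delay, the limit of 3, and the `cdn.example.com` URI are all hypothetical names and values chosen for illustration.

```javascript
// Minimal concurrency-limited job queue (illustrative sketch, not the async module).
// `worker(task, done)` processes one task and calls `done(err, result)` when finished.
function createQueue(worker, concurrency) {
  const waiting = []; // tasks that have not started yet
  let running = 0;    // tasks currently being processed

  function startNext() {
    // Start waiting tasks until the concurrency limit is reached.
    while (running < concurrency && waiting.length > 0) {
      const { task, callback } = waiting.shift();
      running++;
      let finished = false;
      worker(task, (err, result) => {
        if (finished) return; // ignore a worker that calls done() twice
        finished = true;
        running--;
        if (callback) callback(err, result);
        startNext(); // a slot is free again
      });
    }
  }

  return {
    push(task, callback) {
      waiting.push({ task, callback });
      startNext();
    },
    running: () => running,
    length: () => waiting.length,
  };
}

// Hypothetical stand-in for "generate two files and upload them to the CDN".
// The timer and the returned URI are placeholders for the real work.
function generateAndUpload(job, done) {
  setTimeout(() => done(null, `https://cdn.example.com/files/${job.id}`), 100);
}

// At most 3 jobs run at once (assumed value); the rest wait in the queue.
const jobs = createQueue(generateAndUpload, 3);
let nextId = 0;

// Handler with the usual (req, res) signature. The response is held open
// until the job's callback fires, however long the job sat in the queue.
function handleRequest(req, res) {
  jobs.push({ id: ++nextId }, (err, uri) => {
    if (err) {
      res.statusCode = 500;
      return res.end('processing failed');
    }
    res.setHeader('Content-Type', 'application/json');
    res.end(JSON.stringify({ uri }));
  });
}
```

With Express, a handler like this could be registered with something like `app.post('/files', handleRequest)` (the route path is made up). As far as I can tell, the async module's `async.queue(worker, concurrency)` exposes a similar push-with-callback interface, so the hand-written `createQueue` could be swapped for it.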

Upvotes: 7

Diosney

Reputation: 10580

TL;DR: You can use the native Node.js cluster module to handle a lot of concurrent requests.

Some preamble: Node.js per se is single-threaded. Its event loop is what makes it excellent at handling many requests simultaneously even with a single thread, which is one of its best features IMO.

The real deal: So, how can we scale this to handle even more concurrent connections and use all available CPUs? With the cluster module.

This module works exactly as pointed out by @Qualcuno: it allows you to create multiple workers (i.e. processes) behind the master to share the load and use the available CPUs more efficiently.

According to the official Node.js documentation:

Because workers are all separate processes, they can be killed or re-spawned depending on your program's needs, without affecting other workers. As long as there are some workers still alive, the server will continue to accept connections.

The required example:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Workers can share any TCP connection.
  // In this case it's an HTTP server.
  http.createServer(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}

Hope this is what you need.

Comment if you have any further questions.

Upvotes: 32

TJC

Reputation: 727

To do this, I would use a structure like the one Heroku provides with Web/Worker Dynos (servers). The web servers accept the requests and pass the info on to the workers, which do the processing and uploading. I would have the front-end site listen on a socket (socket.io) for the URL of the file on the external CDN, which the worker emits when the upload is finished. Hopefully that makes sense.

Upvotes: 1
