Reputation: 7306
We are building an infrastructure which features a Node.js server and Express.
In the server, what is happening is as follows:
Currently the server does this sequentially for each request, and this works quite well (Node/Express can handle concurrent requests automatically). However, as we grow, the number of concurrent requests may increase, and we believe it would be better to implement a queue for processing requests. Otherwise, we risk having too many tasks running at the same time and too many open connections to the CDN. Responding to the client quickly is not a priority.
What I was thinking about is to have a separate part in the Node server that contains a few "workers" (2-3, but we will do tests to determine the correct number of simultaneous operations). So, the new flow would look something like:
What do you think of this approach? Do you believe it is the correct one?
Most importantly, how could this be implemented in Node/Express?
Thank you for your time
Upvotes: 18
Views: 19447
Reputation: 121
You can use the Kue module, with Redis (the database that holds the jobs) backing the queue. You create jobs and place them in a queue using the kue module, and you can have as many workers as you like working on them. Useful links: kue - https://github.com/Automattic/kue
Upvotes: 0
Reputation: 7306
(Answering my own question)
According to this question on Stack Overflow, a solution in my case would be to implement a queue using Caolan McMahon's async module.
The main application will create jobs and push them into a queue, which has a limit on the number of jobs that can run concurrently. This allows tasks to be processed concurrently while keeping strict control over how many run at once. It works like Cocoa's NSOperationQueue on Mac OS X.
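To illustrate the idea, here is a minimal, dependency-free sketch of a queue with a concurrency limit. It is not the async module's implementation; `createQueue`, `uploads`, and the `setTimeout` stand-in for the real processing and CDN upload are hypothetical names chosen for this example. The async module's `async.queue(worker, concurrency)` offers a similar `push(task, callback)` interface, so this is roughly the behavior you would get from it.

```javascript
// Minimal concurrency-limited job queue (illustrative sketch, hypothetical names).
// `worker(task, done)` processes one task and must call `done(err)` when finished.
function createQueue(worker, concurrency) {
  var pending = []; // tasks waiting for a free slot
  var running = 0;  // tasks currently being processed

  function next() {
    // Start tasks while there is a free slot and work waiting.
    while (running < concurrency && pending.length > 0) {
      runOne(pending.shift());
    }
  }

  function runOne(item) {
    running++;
    worker(item.task, function (err) {
      running--;
      if (item.callback) item.callback(err);
      next(); // a slot was freed, so pick up the next waiting task
    });
  }

  return {
    push: function (task, callback) {
      pending.push({ task: task, callback: callback });
      next();
    },
    running: function () { return running; },       // jobs in progress
    length: function () { return pending.length; }  // jobs still waiting
  };
}

// Usage: at most 2 jobs run at once; the rest wait in the queue.
var uploads = createQueue(function (job, done) {
  // Hypothetical stand-in for "process the file and upload it to the CDN".
  setTimeout(function () {
    console.log('finished', job.name);
    done();
  }, 100);
}, 2);

['a.png', 'b.png', 'c.png'].forEach(function (name) {
  uploads.push({ name: name });
});
```

In an Express route handler you would call `push` with the request's data and respond to the client right away, since a quick response with the final result is not required here. The concurrency value (2 above) is the number to tune in your tests.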
Upvotes: 7
Reputation: 10580
TL;DR: You can use the native Node.js cluster module to handle a lot of concurrent requests.
Some preamble: Node.js per se is single threaded. Its event loop is what makes it excellent at handling multiple requests simultaneously even with its single-threaded model, which is one of its best features IMO.
The real deal: So, how can we scale this to handle even more concurrent connections and use all available CPUs? With the cluster module.
This module works exactly as @Qualcuno described: it allows you to create multiple workers (i.e. processes) behind the master to share the load and make more efficient use of the available CPUs.
According to the official Node.js documentation:
Because workers are all separate processes, they can be killed or re-spawned depending on your program's needs, without affecting other workers. As long as there are some workers still alive, the server will continue to accept connections.
The required example:
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // Fork one worker per CPU.
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    // Log when a worker process exits.
    cluster.on('exit', function(worker, code, signal) {
        console.log('worker ' + worker.process.pid + ' died');
    });
} else {
    // Workers can share any TCP connection.
    // In this case it's an HTTP server.
    http.createServer(function(req, res) {
        res.writeHead(200);
        res.end("hello world\n");
    }).listen(8000);
}
Hope this is what you need.
Comment if you have any further questions.
Upvotes: 32
Reputation: 727
To do this, I would use a structure like the one Heroku provides with Web/Worker Dynos (servers). The web servers can accept the requests and pass the info on to the workers, which do the processing and uploading. I would have the front-end site listen on a socket (socket.io) for the URL on the external CDN, which the worker emits when the upload is finished. Hopefully that makes sense.
Upvotes: 1