monday

Reputation: 319

What is a recommended way to separate computationally heavy tasks from nodejs-based API?

I have a NodeJS API that does basic things (like manipulating DB), but also, much less frequently, computationally taxing stuff like video encoding.

It seems like a better solution, from a scalability perspective, would be to split it into a primary API server and separate workers that do the heavy lifting. This way I have two pools, "low-cost" for primary API and "expensive" for workers, allowing for better resource management.

Is there a standard approach to this pattern?

The only ways to deal with infrequent heavy requests seem to be either writing the worker servers from scratch or spawning child processes. Both options require a lot of extra code (pooling, queuing, etc.), and the problem seems common enough that the absence of, say, a node-worker package makes me suspect my approach is either uncommon or simply wrong.

Upvotes: 1

Views: 1807

Answers (2)

Yuci

Reputation: 30079

In Node.js there are generally three approaches to handling computationally heavy tasks:

1. Use Node in the cluster mode

For best performance, we should make the most of the CPU cores, whether logical or physical. Node.js has a single main thread with a single event loop, so at any given time the main thread can only use one CPU core. What we can do is start as many Node.js instances/servers as there are CPU cores. Ideally we should also apply load balancing, health checking, auto-restarting, etc., to this array of Node.js instances.

PM2 is a recommended open-source production runtime and process manager for Node.js applications with a built-in load balancer. It allows you to keep applications alive forever, to reload them without downtime, and to facilitate common DevOps tasks.
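For illustration, here is a minimal sketch of cluster mode using Node's built-in cluster module (which is essentially the mechanism PM2's cluster mode manages for you); the port number and per-worker handler are placeholders:

    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isPrimary) { // cluster.isMaster on Node < 16
      // Fork one worker per CPU core and replace any worker that dies
      for (let i = 0; i < os.cpus().length; i++) cluster.fork();
      cluster.on('exit', () => cluster.fork());
    } else {
      // Each worker runs its own event loop; incoming connections are
      // distributed across the workers by the cluster module
      http.createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      }).listen(3000);
    }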

2. Use worker threads

Node.js also offers a thread pool (i.e., a Worker Pool). In Node there are two types of threads: one Event Loop (aka the main loop, main thread, event thread, etc.), and a pool of k Workers in a Worker Pool (aka the threadpool).

In order not to block the main thread, we should offload expensive tasks to the Worker threads.

WebWorker Threads is a recommended library to use for this approach.
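As a concrete illustration, here is a minimal sketch using Node's built-in worker_threads module rather than the library above (assuming Node 12+; the fibonacci call is just a stand-in for your expensive task):

    const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

    if (isMainThread) {
      // Main thread: offload the expensive call so the event loop stays free
      function runHeavyTask(input) {
        return new Promise((resolve, reject) => {
          const worker = new Worker(__filename, { workerData: input });
          worker.on('message', resolve);
          worker.on('error', reject);
          worker.on('exit', (code) => {
            if (code !== 0) reject(new Error(`worker exited with code ${code}`));
          });
        });
      }

      runHeavyTask(40).then((result) => console.log('result:', result));
    } else {
      // Worker thread: the CPU-heavy work runs here without blocking the main thread
      const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
      parentPort.postMessage(fib(workerData));
    }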

3. Offload heavy computation to other servers

You can offload heavy computation to separate worker servers, having them communicate via a message queue server like RabbitMQ. This is probably the most scalable, flexible and reliable approach. The worker servers can be implemented in Node.js, Java, or any other appropriate technologies.
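A minimal sketch of this pattern, assuming a RabbitMQ broker on localhost, the amqplib package, a queue named video-encoding, and a hypothetical encodeVideo() function on the worker side:

    const amqp = require('amqplib');

    // API server side: enqueue a job instead of doing the work in-process
    async function enqueueJob(job) {
      const conn = await amqp.connect('amqp://localhost');
      const ch = await conn.createChannel();
      await ch.assertQueue('video-encoding', { durable: true });
      // Persistent messages survive a broker restart
      ch.sendToQueue('video-encoding', Buffer.from(JSON.stringify(job)), { persistent: true });
      await ch.close();
      await conn.close();
    }

    // Worker side (runs on the "expensive" pool): consume and process jobs
    async function startWorker() {
      const conn = await amqp.connect('amqp://localhost');
      const ch = await conn.createChannel();
      await ch.assertQueue('video-encoding', { durable: true });
      ch.prefetch(1); // take one job at a time per worker
      ch.consume('video-encoding', async (msg) => {
        const job = JSON.parse(msg.content.toString());
        await encodeVideo(job);  // hypothetical heavy-lifting routine
        ch.ack(msg);             // acknowledge only after the work succeeds
      });
    }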

Upvotes: 2

swapz83

Reputation: 422

Not sure if this is a question for Server Fault, but here goes: Node is notoriously poor at computationally expensive work, since it uses a single-threaded model. I suggest one of these alternatives:

  1. Segregate Node.js workers as you mentioned, but you need a layer above them to delegate tasks.
  2. Have the expensive work done in "downstream services" that accept HTTP requests. Your Node server can connect to these and receive async responses once the work is done (see the sketch after this list).
  3. The downstream services can be written in (a) a thread-friendly language like Java that handles computational heavy lifting well, (b) lots of spun-up or spin-on-demand Node processes, or (c) dedicated compute infrastructure like AWS Lambda.
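A rough sketch of option 2, assuming the express package for the downstream service, Node 18+ for the global fetch on the API side, and a hypothetical encodeVideo() routine plus a made-up internal hostname:

    // downstream-service.js -- the "expensive" service exposing heavy work over HTTP
    const express = require('express');
    const app = express();
    app.use(express.json());

    app.post('/encode', async (req, res) => {
      const result = await encodeVideo(req.body); // hypothetical CPU-heavy routine
      res.json(result);
    });

    app.listen(4000);

    // api-server.js -- the "low-cost" API delegates and awaits the response
    async function delegateEncoding(job) {
      const response = await fetch('http://encoder.internal:4000/encode', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(job),
      });
      return response.json();
    }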

Upvotes: 2
