asanas

Reputation: 4280

Setting up a long-running task on nodejs

In my backend, I have about 5k users. Every day at 12pm UTC, I need to run a background Node task that carries out some real-time calculations on each user's data and sends the result to the user by email, SMS, push notification, etc. The calculation is fairly intensive and takes a few seconds per user.

Since 5k is a large number, I've created a schedule that calls an API endpoint every minute, and each call processes 5 users. The problem with this approach is that at 5 users per minute it takes over 16 hours (1,000 minutes) to process all 5,000 users. At the same time, the number of users is growing; soon enough, even 24 hours will not be enough.

What's the alternative? How can I process all these 5k users or even 10k users in a much shorter duration?

Upvotes: 2

Views: 464

Answers (1)

jfriend00

Reputation: 707238

Without seeing the relevant code, or a detailed description of exactly what this background node task does, we can't advise specifically on which approach or approaches you need to take. The answer depends entirely on understanding that task in detail.

If the nodejs background task is CPU-bound, then you will need to involve more CPUs in the processing, either with child processes (the child_process module) or Worker Threads (the worker_threads module).
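For the CPU-bound case, here is a minimal sketch of spreading users across Worker Threads. It assumes the calculation needs only a user ID as input; `heavyCalculation`, `runChunk` and `processAllUsers` are hypothetical names, and the loop inside `heavyCalculation` is just a placeholder for the real work. The worker source is passed as a string (`eval: true`) only to keep the example in a single file; in a real project it would more likely live in its own module.

```javascript
const os = require('node:os');
const { Worker } = require('node:worker_threads');

// Source for the worker thread, passed as a string so the example fits in one file.
// heavyCalculation is a hypothetical stand-in for the real per-user computation.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');

  function heavyCalculation(userId) {
    let acc = 0;
    for (let i = 0; i < 1e6; i++) acc = (acc + userId * i) % 1000003;
    return acc;
  }

  parentPort.postMessage(
    workerData.userIds.map((id) => ({ id, result: heavyCalculation(id) }))
  );
`;

// Run one slice of user IDs on its own thread and resolve with that slice's results.
function runChunk(userIds) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { userIds } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

// Split the user IDs into contiguous slices, one per worker, and run them in parallel.
async function processAllUsers(userIds, workerCount = os.cpus().length) {
  const chunkSize = Math.max(1, Math.ceil(userIds.length / workerCount));
  const chunks = [];
  for (let i = 0; i < userIds.length; i += chunkSize) {
    chunks.push(userIds.slice(i, i + chunkSize));
  }
  const perChunk = await Promise.all(chunks.map(runChunk));
  return perChunk.flat();
}
```

Calling `processAllUsers(allUserIds)` starts at most one worker per CPU core by default. Starting more workers than cores generally doesn't help for purely CPU-bound work.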

If the nodejs background task is database-limited, then you will need to scale your database to handle more requests in less time or redesign how the relevant data is stored to be more efficient.

If nearly all the processing is asynchronous and neither CPU-bound nor database-limited, then you probably need to be processing N users in parallel rather than a handful per scheduled call.
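For the asynchronous case, processing N users at a time could look roughly like the sketch below. `processUser` and `mapWithConcurrency` are hypothetical names; `processUser` only simulates I/O with a timer and would be replaced by the real load/calculate/send logic. A failure for one user is recorded in the results instead of aborting the whole run, which is an assumption about what you'd want for a daily batch.

```javascript
// Hypothetical per-user job: stands in for async I/O (DB query, email/SMS API call).
async function processUser(userId) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `done:${userId}`;
}

// Run `handler` over `items`, keeping at most `limit` calls in flight at once.
// Each result is { ok: true, value } or { ok: false, error }, in input order.
async function mapWithConcurrency(items, limit, handler) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each "lane" repeatedly claims the next unprocessed item until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await handler(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

For example, `await mapWithConcurrency(userIds, 50, processUser)` keeps 50 users in flight. The right value of N depends on what your database and your email/SMS providers can handle, so start low and raise it while watching their load.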

As with any performance problem, it's not good to guess where the biggest bottleneck is. You need to instrument and measure! I would suggest that you start by instrumenting the processing of one user to find out exactly where all the time is going. Then, start with the longest pole in the tent and dive into exactly why it's taking that much time. See how much you can improve that one item, then move on to the next one.

Once you've made the processing of a single user as fast as it can be, work on ways to process N users at a time. By dividing up the users across more processes and processing N at a time in each process, you can scale the calculation work as much as you want. You will have to make sure that your database can keep up with the additional load from all the separate processes.
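As a starting point for that instrumentation, here is a minimal sketch that times each stage of one user's processing with `performance.now()` from `perf_hooks`. `createTimer` and the stage names in the usage note are hypothetical; substitute whatever steps your task really has.

```javascript
const { performance } = require('node:perf_hooks');

// Accumulates elapsed wall-clock milliseconds per named stage.
function createTimer() {
  const totals = new Map();

  // Run `fn`, add its duration to the total for `label`, and pass its result through.
  async function time(label, fn) {
    const start = performance.now();
    try {
      return await fn();
    } finally {
      const elapsed = performance.now() - start;
      totals.set(label, (totals.get(label) ?? 0) + elapsed);
    }
  }

  // Stages as [label, totalMs] pairs, slowest first.
  function report() {
    return [...totals.entries()].sort((a, b) => b[1] - a[1]);
  }

  return { time, report };
}
```

Wrap each step, for example `const data = await timer.time('loadData', () => loadData(userId))`, then do the same for the calculation and the send step, and print `timer.report()` after processing one user (or a small sample). The first entry is the stage to look at first.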

Upvotes: 2
