Dhruv Shah

Reputation: 1651

How to handle a CronJob that queries and updates a table in a SQL database using Node.js?

I need to run a CronJob that performs three interdependent async tasks at the interval specified in the CronJob config.

Async Task-1: Query the table to fetch the rows that match a particular criterion.

Async Task-2: Perform an async operation on the results fetched in Task-1.

Async Task-3: Update the table entries for the corresponding IDs with the result of the operation performed in Task-2.

I can't figure out what would happen if the next CronJob interval begins before the tasks of the previous interval have finished, or how this can be managed.

More specifically: is there a way to keep the SQL table and the running tasks in sync, so that if an UPDATE task is still pending from one cycle, the same task isn't performed again in the next cycle?

I am using the node-cron npm module to build the CronJob.

Upvotes: 0

Views: 884

Answers (1)

Vitor Baptista

Reputation: 2096

Unfortunately, cron doesn't support dependency between jobs, so you have to handle this yourself. You have basically two options:

  • Merging the tasks into a single one
  • Having a flag somewhere that lets Task-n know if Task-n-1 has finished successfully
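As a rough illustration of the first option, the three tasks can be chained with `await` inside one job, and the job wrapped so that a tick is skipped while the previous run is still in flight. This is only a minimal sketch: `fetchRows`, `processRow`, `updateRow`, `makeJob` and `skipIfRunning` are hypothetical names, not part of node-cron or any SQL client, and the in-memory flag only protects a single Node.js process.

```javascript
// Wraps an async job so that a new tick is skipped while the previous run is still in flight.
// Assumption: only one Node.js process runs the CronJob; the flag is not shared across processes.
function skipIfRunning(job) {
  let running = false;
  return async function guarded() {
    if (running) return 'skipped';
    running = true;
    try {
      await job();
      return 'ran';
    } finally {
      // Always release the flag, even if the job throws.
      running = false;
    }
  };
}

// The three tasks merged into one job. The three callbacks are hypothetical
// placeholders for your own query, operation and update code.
function makeJob({ fetchRows, processRow, updateRow }) {
  return async function job() {
    const rows = await fetchRows();            // Task-1: query the table
    for (const row of rows) {
      const result = await processRow(row);    // Task-2: async operation
      await updateRow(row.id, result);         // Task-3: update the same row
    }
  };
}

// With node-cron this could be wired up roughly as follows (the schedule is an example):
//   const cron = require('node-cron');
//   cron.schedule('*/5 * * * *', skipIfRunning(makeJob({ fetchRows, processRow, updateRow })));
```

Because the guard is released in `finally`, a failed run does not block later ticks; rows that were not updated should simply match the Task-1 query again on the next run.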

Your life will be much simpler if you're able to merge the tasks, as you can use the tools you're used to in JavaScript. If not, you could do something like:

  • Async Task-1 queries the DB and saves the result to a known place (e.g. 2018-08-31-task-1-results.csv)
  • Async Task-2 checks if 2018-08-31-task-1-results.csv exists. If it does, it knows that the previous task was successful, and can process the file and save the output to another file (e.g. 2018-08-31-task-2-results.csv)
  • Async Task-3 proceeds similarly to Async Task-2.
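The file-based chain above could be sketched roughly like this in Node.js. The helper names `resultPath` and `runStage` are hypothetical, and the sketch stores JSON rather than CSV purely to keep the parsing short; each stage runs only when the previous stage's output exists and its own output does not.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical naming scheme, modelled on the example file names above.
function resultPath(dir, date, taskNumber) {
  return path.join(dir, `${date}-task-${taskNumber}-results.json`);
}

// Runs one stage of the chain for a given date.
// Returns 'already-done' if this stage's output exists, 'waiting' if the
// previous stage has not produced its output yet, and 'done' otherwise.
async function runStage(dir, date, taskNumber, work) {
  const output = resultPath(dir, date, taskNumber);
  if (fs.existsSync(output)) return 'already-done';

  let input;
  if (taskNumber > 1) {
    const inputFile = resultPath(dir, date, taskNumber - 1);
    if (!fs.existsSync(inputFile)) return 'waiting';
    input = JSON.parse(fs.readFileSync(inputFile, 'utf8'));
  }

  const result = await work(input);

  // Write to a temporary name and rename, so a half-written file is never
  // visible under the final name that the next stage checks for.
  const tmp = `${output}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(result));
  fs.renameSync(tmp, output);
  return 'done';
}
```

Each cron entry would then call `runStage` with its own task number and `work` function; a stage that fires too early just reports `'waiting'` and tries again on its next tick.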

In other words, the tasks don't depend on each other directly, but on the output generated by the previous task. This lets you re-run individual tasks and keeps a log of their outputs. My example uses files, but it can be anything that all tasks can access, such as an intermediate table.

In the future, if you keep having to handwrite these dependency chains, I'd suggest considering one of the many task pipeline frameworks like Luigi and Airflow.

Upvotes: 2
