Reputation: 29502
I am currently developing a Node.js app with a REST API that exposes data from a MongoDB database.
The application needs to update some data every 5 minutes by calling an external service (which can take more than one minute to return the new data).
I decided to isolate this task in a child_process, but I am not sure what I should put in this child process.
I don't really know whether there is a big cost to starting a new child process every 5 minutes, whether I should use a single long-running child process, or whether I am overthinking the problem ^^
EDIT - Information about the update task
The update task can take more than one minute, but it consists of many smaller tasks (gathering information from many external providers) that run asynchronously, so maybe I don't even need a child process?
Thanks!
Upvotes: 0
Views: 1424
Reputation: 707258
The total clock time it takes to get data from an external service does not matter as long as you are using asynchronous requests. What matters is how much CPU you are using in doing so. If the majority of the time is spent waiting for the external service to respond or to send the data, then your node.js server is just sitting idle most of the time and you probably do not need a child process.
Because node.js is asynchronous, it can happily have many open requests "in flight" that it is waiting on responses for, and that takes very few system resources.
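To illustrate the point about requests "in flight", here is a minimal sketch. `fakeProvider` and `fetchAll` are made-up names, and the timer only stands in for a real network call to one of your providers. All requests are started at once, so the total wait is roughly that of the slowest provider rather than the sum, and the process is free to serve API requests in the meantime.

```javascript
// Hypothetical stand-in for a call to an external provider:
// resolves after `ms` milliseconds without using any CPU while waiting.
function fakeProvider(name, ms) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ provider: name, waitedMs: ms });
    }, ms);
  });
}

// Start every request immediately and wait for all of them together.
// Results come back in the same order as the input list.
function fetchAll(providers) {
  return Promise.all(
    providers.map(function (p) {
      return fakeProvider(p.name, p.ms);
    })
  );
}
```

Calling `fetchAll([{ name: "a", ms: 300 }, { name: "b", ms: 500 }])` should settle after about 500 ms, not 800 ms.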
Because node.js is single threaded, it is CPU usage that typically drives the need for a child process. If it takes 5 minutes to get a response from an external service, but only 50ms of actual CPU time to process that request and do something with it, then you probably don't need a child process.
If it were me, I would separate out the code for communicating with the external service into a module of its own, but I would not add the complexity of a child process until you actually have some data showing that such a change is needed.
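One way to collect that kind of data is to compare wall-clock time with CPU time for a single update run. This is only a rough sketch: `measure` is a made-up helper, and `process.cpuUsage()` reports CPU time for the whole process, so anything else the server does during the run is counted too.

```javascript
// Runs `task` (a function returning a value or a promise) and reports
// how long it took on the clock versus how much CPU time the process used.
function measure(task) {
  const startWall = process.hrtime.bigint(); // nanoseconds
  const startCpu = process.cpuUsage();       // { user, system } in microseconds
  return Promise.resolve()
    .then(task)
    .then(function (result) {
      const cpu = process.cpuUsage(startCpu); // difference since startCpu
      return {
        result: result,
        wallMs: Number(process.hrtime.bigint() - startWall) / 1e6,
        cpuMs: (cpu.user + cpu.system) / 1000
      };
    });
}
```

If `wallMs` is in the tens of seconds while `cpuMs` stays in the tens of milliseconds, the task is mostly waiting and a child process is unlikely to buy you much.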
I don't really know if there is a big cost to start a new child process every 5 minutes or if I should use only one long time running child process or if I am overthinking the problem
There is definitely some cost to starting up a new child process. It's not huge, but if you're going to be doing it every 5 minutes and the child doesn't take a huge amount of memory, then it's probably better to just start up the child process once, have it manage the scheduling of communication with the external service entirely on its own, and have it communicate results back to your other node.js process as needed. This makes the second node process much more self-contained, and the only point of interaction between the two processes is communicating an update. This separation of function and responsibility is generally considered a good thing. In a multi-developer project, you could more easily have different developers working on each app.
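If you do go the single long-running child route, a minimal sketch could look like the following. It uses `child_process.fork()` and the built-in IPC channel (`process.send()` in the child, the `"message"` event in the parent). Everything else is an assumption for illustration: `fetchUpdate`, `runWorker`, `applyWorkerMessage`, `startUpdater`, and the message shapes are made-up names, and in practice the worker part would live in its own file.

```javascript
const { fork } = require("child_process");

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Worker side: owns the schedule and reports results to the parent over IPC.
// `fetchUpdate` is a hypothetical async function that calls the external service.
function runWorker(fetchUpdate, intervalMs) {
  function tick() {
    fetchUpdate()
      .then(function (data) {
        process.send({ type: "update", data: data });
      })
      .catch(function (err) {
        process.send({ type: "error", message: err.message });
      })
      .then(function () {
        // Schedule the next run only after this one has finished.
        setTimeout(tick, intervalMs);
      });
  }
  tick();
}

// Parent side: applies one message from the worker to an in-memory cache.
function applyWorkerMessage(cache, msg) {
  if (msg && msg.type === "update") {
    cache.data = msg.data;
    cache.updatedAt = Date.now();
  } else if (msg && msg.type === "error") {
    cache.lastError = msg.message;
  }
  return cache;
}

// Parent side: start the worker once and keep it for the lifetime of the app.
function startUpdater(workerPath, cache) {
  const child = fork(workerPath);
  child.on("message", function (msg) {
    applyWorkerMessage(cache, msg);
  });
  child.on("exit", function (code) {
    console.error("updater exited with code " + code);
  });
  return child;
}
```

The worker file would call something like `runWorker(fetchUpdate, FIVE_MINUTES_MS)`, and the API server would call `startUpdater("./worker.js", cache)` once at startup (the path is a placeholder). The update message is the only point of contact between the two processes.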
Upvotes: 1
Reputation: 1500
It depends on how tightly coupled your app and the auto-refresh task are.
If the auto-refresh task can run standalone, without interacting with your app, then it is better to start the task as a separate process. Using child_process directly is not a good idea here: spawning, monitoring, and respawning a child process is tricky. You can use crontab or pm2 to manage it instead.
If the auto-refresh task depends on your app, you can use child_process directly and send messages to it for scheduling. But first try to break this dependency: that will simplify your app and make the two parts easier to deploy and maintain separately. Whether the child process is long-running or one-shot does not matter until you have hundreds of such tasks running on one machine.
Upvotes: 1
Reputation: 19248
Node.js has an event-driven architecture capable of handling asynchronous calls, so it is unlike a typical C++ program, where you would go with a multi-threaded or multi-process architecture.
For your use case, maybe you can use setInterval to repeatedly perform the operation, and define the smaller async calls inside it using a promise library such as Bluebird?
For more information see:
setInterval: https://developer.mozilla.org/en-US/docs/Web/API/WindowTimers/setInterval
setInterval()
Repeatedly calls a function or executes a code snippet, with a fixed time delay between each call. Returns an intervalID.
Sample code:
var MILLISECONDS_IN_FIVE_MINUTES = 5 * 60 * 1000;

setInterval(function() {
  console.log("I was executed");
}, MILLISECONDS_IN_FIVE_MINUTES);
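One thing to watch with setInterval is that it fires on schedule whether or not the previous run has finished. Since the update can take more than a minute, a small guard that skips a tick while a run is still in progress may be useful. This is only a sketch: `makeGuardedTask` is a made-up helper, and `task` stands for your own update function.

```javascript
// Wraps `task` (a function returning a value or a promise) so that a new run
// is skipped while the previous one is still in progress.
function makeGuardedTask(task) {
  let running = false;
  return function run() {
    if (running) {
      return Promise.resolve("skipped");
    }
    running = true;
    let result;
    try {
      result = Promise.resolve(task()); // start the task right away
    } catch (err) {
      result = Promise.reject(err);     // treat a synchronous throw like a rejection
    }
    return result
      .then(
        function () { return "done"; },
        function (err) {
          console.error("update failed: " + err.message);
          return "failed";
        }
      )
      .then(function (status) {
        running = false; // allow the next tick to run
        return status;
      });
  };
}
```

You would then pass the wrapped function to the timer, for example `setInterval(makeGuardedTask(updateTask), 5 * 60 * 1000)`, where `updateTask` is your own function.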
Promises: http://bluebirdjs.com/docs/features.html
Sample code:
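// updateExternalService, parseExtResp and refreshData stand for your own
// functions; each is assumed to return a value or a promise.
function runUpdate(data) {
  // updateExternalService already returns a promise, so chain on it directly
  // instead of wrapping it in new Promise().
  return updateExternalService(data)
    .then(function(response) {
      return parseExtResp(response);
    })
    .then(function(parsedResp) {
      return refreshData(parsedResp);
    })
    .then(function(returnCode) {
      console.log("yay updated external data source and refreshed");
    })
    .catch(function(error) {
      // Handle error, then rethrow so the caller still sees the failure
      console.log("oops something went wrong -> " + error.message);
      throw error;
    });
}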
Upvotes: 2