user393219

Watching NodeJS Clusters For Exit

I'm having a hard time wrapping my head around letting a node.js process run something asynchronously while still triggering an 'exit' state so I can do more when the CPU-crunching is complete.

For example, I've got a Google Places crawler that efficiently distributes http requests across all available CPUs.

} else if (cluster.isWorker) {
//  Code to run if we're in a worker process

// Send the object we created above from variables so they're available to the workers
process.on('message', function(clusterDivisionObject) {
    var tempArray;

    // Take the chunk of the array appropriate for this worker, then request its place details
    tempArray = clusterDivisionObject.placeIdArray.splice(((cluster.worker.id * clusterDivisionObject.clusterDivision) - clusterDivisionObject.clusterDivision), clusterDivisionObject.clusterDivision);
    tempArray.forEach(function(arrayItem, index, array){
      request({url: config.detailsRequestURI + '?key=' + config.apiKey + '&placeid=' + arrayItem, headers: config.headers}, detailsRequest);
    });
});
}

The real issue here is the last line, where I send an asynchronous request() statement. The code executes correctly, but once I hit the callback (detailsRequest) to do something (in this case, write to a JSON file) I don't have any control for exiting the process. My callback function:

function detailsRequest(error, response, body) {
    if (!error && response.statusCode == 200) {
        var detailsBody = JSON.parse(body);
        ...
    }
}

...has no awareness of which process is running it or how many iterations have been made (to trigger an exit after the entire tempArray is exhausted). So, assuming that one worker is running request() for a tempArray of length x, how can I trigger process.exit(0) when that tempArray.forEach(){} has completed?

I've tried calling process.exit(0) directly after tempArray.forEach(){}, but the process dies before request() is even run. Is there an efficient way to watch a process so I can call its exit, or am I really trying to tackle a problem that can't be solved this way, since request() is async and the callbacks can complete in any order?

Upvotes: 0

Views: 1110

Answers (1)

Yousef

Reputation: 401

You need async flow control. You don't want your process exiting until all requests complete, but right now you're asking node to send out all these requests and then exit the process immediately. Check out async.js or some other flow control library. You need something like this:

var tempArray = []; // same as above
var counter = 0;

// Without async.js
tempArray.forEach(function(arrayItem, index, array){
  request({url: config.detailsRequestURI + '?key=' + config.apiKey + '&placeid=' + arrayItem, headers: config.headers}, detailsRequest);
});

function detailsRequest(error, response, body){
  // increment the counter and handle the response;
  // this callback gets called once per request (N times in total)
  counter += 1;
  if (counter >= tempArray.length) { process.exit(0); }
}
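To make the counter pattern above concrete, here is a self-contained sketch. runBatch and fakeRequest are hypothetical names for illustration; fakeRequest stands in for request() so the snippet runs without the network:

```javascript
// Run a callback-style request for every item and fire onAllDone
// only once every callback has come back
function runBatch(items, sendRequest, onAllDone) {
  var completed = 0;
  if (items.length === 0) { return onAllDone(); }
  items.forEach(function (item) {
    sendRequest(item, function () {
      completed += 1;
      if (completed === items.length) { onAllDone(); } // last one back
    });
  });
}

// Hypothetical stand-in that calls back synchronously for demonstration
function fakeRequest(item, callback) {
  callback(null, { statusCode: 200 }, '{"result": "' + item + '"}');
}

var finished = false;
runBatch(['a', 'b', 'c'], fakeRequest, function () { finished = true; });
console.log(finished); // true
```

In the real code, process.exit(0) would go where finished is set, and request() would replace fakeRequest.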


// With async.js:

async.map(tempArray, sendRequestFunc, function finalDone(err, results){
  // results holds every response; check it here,
  // then exit
  process.exit(0);
});

function sendRequestFunc(el, done){
  // done is the per-item callback, as per the async docs;
  // done must be invoked here or the final callback is never triggered
  request({url: 'same as above'}, done);
}

Keep in mind that you may need to add additional checks for errors or bad responses and handle those accordingly.

The done callback within sendRequestFunc is invoked only when the request returns a response or an error (asynchronously), and the final callback finalDone is invoked only when all the responses have returned.
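If you'd rather avoid a dependency, the same flow can be sketched with native Promises. sendRequestPromise and fakeRequest below are hypothetical names, not part of request or async.js; fakeRequest stands in for request() so the sketch is runnable:

```javascript
// Hypothetical stand-in for request() so the sketch runs without the network
function fakeRequest(options, callback) {
  callback(null, { statusCode: 200 }, '{"result": "ok"}');
}

// Wrap one callback-style request in a Promise
function sendRequestPromise(placeId) {
  return new Promise(function (resolve, reject) {
    fakeRequest({ url: 'same as above' }, function (error, response, body) {
      if (error) { reject(error); } else { resolve(body); }
    });
  });
}

// Promise.all resolves only after every request has settled,
// which is the point where it is safe to exit the worker
Promise.all(['id1', 'id2'].map(sendRequestPromise)).then(function (bodies) {
  console.log(bodies.length); // 2
  // process.exit(0) would go here in the worker
});
```

Like async.map, Promise.all collects the results in order, so any post-processing can happen in the .then before exiting.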

Upvotes: 1
