Reputation: 1639
I'm using node.js with cluster, typically with 2 CPUs, which translates to one master and two workers. I'm having a sneaky problem where occasionally (very rarely) one of the workers gets 'stuck' for some reason, and the other bears all of the load. I'm not sure of the cause and am still investigating (no memory leak, no stack overflow, no exception).
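For reference, the setup is roughly the standard cluster pattern (a simplified sketch; the real app logic is omitted):
var cluster = require('cluster');
var os = require('os');
var http = require('http');

if (cluster.isMaster) {
  // Fork one worker per CPU (two in my case).
  for (var i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs its own server instance.
  http.createServer(function(req, res) {
    res.end('ok');
  }).listen(8000);
}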
When looking at the processes with the top command on Linux, I can clearly see that one of the node processes is steady at 100% CPU load.
What I want to ask is whether you know of a way to detect this situation (one worker stuck at 100%) so I can kill it off.
Upvotes: 0
Views: 1261
Reputation: 1639
OK, so here goes. It turns out my worker gets completely stuck. I don't know why, but it may be a cluster problem (what you'd call a cluster %^&$). Anyway, I had the master monitor the workers: on a cron-style timer, each worker reports to the master once a minute, like so:
process.send({ id: cluster.worker.id });
The master receives that message and knows that this worker is alive and well. For each worker, the master keeps a countdown that starts at 5, is decremented once every minute, and is reset whenever a message arrives; if the countdown reaches 0 (five minutes without a report), the worker is killed.
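For illustration, a minimal sketch of that bookkeeping, assuming the master forks the workers elsewhere (the 5-minute budget and one-minute tick are as described above; the names are my own):
var cluster = require('cluster');

if (cluster.isWorker) {
  // Worker side: heartbeat to the master once a minute.
  setInterval(function() {
    process.send({ id: cluster.worker.id });
  }, 60 * 1000);
} else {
  // Master side: a countdown per worker, reset on every heartbeat.
  var countdowns = {};

  cluster.on('fork', function(worker) {
    countdowns[worker.id] = 5;
    worker.on('message', function(message) {
      if (message && message.id) {
        countdowns[worker.id] = 5;
      }
    });
  });

  setInterval(function() {
    Object.keys(cluster.workers).forEach(function(id) {
      countdowns[id]--;
      if (countdowns[id] <= 0) {
        // Five missed heartbeats: assume the worker is stuck and kill it.
        cluster.workers[id].kill();
        delete countdowns[id];
      }
    });
  }, 60 * 1000);
}
Since a worker whose event loop is blocked can't run its own timers, the missing heartbeats are exactly what exposes it.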
This is how I achieved my (own) goal of killing a stuck worker after a few minutes. It's not a complete solution, and I still don't know what causes the workers to get stuck without any exception, but that's life right now.
Upvotes: 2
Reputation: 588
Check out the usage package. Something like the following should work (I've skipped the cluster and worker setup).
var usage = require('usage');

// Poll the worker's CPU usage every 5 seconds.
setInterval(function() {
  usage.lookup(worker.process.pid, function(err, result) {
    if (err) return console.error(err);
    console.log(result);
    // Kill the worker if it is pinned near 100% CPU.
    if (result.cpu > 90) {
      worker.kill();
    }
  });
}, 5000);
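To run that check from the master against every worker, the wiring might look like this (a hypothetical sketch; same threshold and interval as above):
var usage = require('usage');
var cluster = require('cluster');

setInterval(function() {
  Object.keys(cluster.workers).forEach(function(id) {
    var worker = cluster.workers[id];
    usage.lookup(worker.process.pid, function(err, result) {
      if (err) return console.error(err);
      // result.cpu is the CPU usage percentage for this worker's process.
      if (result.cpu > 90) {
        worker.kill();
      }
    });
  });
}, 5000);
Bear in mind that a single sample above 90% can also be a legitimately busy worker, so in practice you'd probably want several consecutive high readings before killing it.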
Upvotes: 0