Reputation: 1200
I have software that monitors the health of several Linux machines on a local network. One of its checks is to ping all of the machines periodically to ensure that they are responsive.
It has recently come to my attention that one or more machines can be in a kernel panic state yet still respond to ping. I'd like to know if there's some sort of check I can do in C++ that returns true when either:
a) The remote machine is unresponsive (I currently check this with ping).
b) The remote machine is responsive, but in a kernel panic state.
The thing is, I don't know what works and what doesn't during a kernel panic.
This is on RHEL 5.7 if that helps. Thanks in advance!
Upvotes: 1
Views: 1208
Reputation: 15205
The answer to that is: it depends. Sometimes a kernel panic will stop ping responses too, and sometimes it won't. The definition of "unresponsive" depends on the use case of the machine. If there's a way to ascertain that the machine's main purpose is still being served, you may be able to use SNMP, HTTP, or some other network protocol to make sure it responds.
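As a rough illustration of a service-level check in C++, here is a minimal sketch using POSIX sockets. ICMP echo replies are generated inside the kernel's network stack, and so is the TCP handshake, so a successful ping or even a successful connect() does not show that user-space processes are still running. This sketch therefore also waits for the service to send at least one byte. The function name `service_responds` is made up for this example, and it assumes the target runs a service that speaks first after the connection opens (for example, an SSH daemon sending its version banner). A false result means only "no answer in time"; it cannot tell a kernel panic apart from any other kind of hang or outage.

```cpp
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical helper: returns true only if the service at ipv4:port accepts
// a TCP connection AND sends at least one byte. timeout_ms applies to each
// of the two steps separately. Assumes a "server speaks first" protocol.
bool service_responds(const std::string& ipv4, std::uint16_t port, int timeout_ms) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4.c_str(), &addr.sin_addr) != 1) return false;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;

    // Non-blocking mode so connect() cannot hang longer than our timeout.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return false;
    }

    bool ok = false;
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc == 0 || errno == EINPROGRESS) {
        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLOUT;

        // Step 1: wait for the handshake to finish, then read its result.
        int err = 0;
        socklen_t len = sizeof(err);
        bool connected = rc == 0 ||
            (poll(&pfd, 1, timeout_ms) == 1 &&
             getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0);

        // Step 2: require actual data, which a user-space process must send.
        if (connected) {
            pfd.events = POLLIN;
            pfd.revents = 0;
            char buf[64];
            if (poll(&pfd, 1, timeout_ms) == 1 && recv(fd, buf, sizeof(buf), 0) > 0) {
                ok = true;
            }
        }
    }
    close(fd);
    return ok;
}
```

A monitor could then treat a machine as unhealthy when something like `!service_responds("192.168.1.10", 22, 2000)` holds (the address, port, and timeout here are placeholders), alongside the existing ping check.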
One common monitoring approach (with lots of pre-made plugins for a wide range of checks and services) is to use Nagios, Icinga, or a similar tool.
Upvotes: 2