Reputation: 4443
I'm currently monitoring a large network with Hobbit and have been tasked with reducing the number of false (or at least irrelevant) alarms. At the top of my list are the "http" and "conn" tests, initiated by bbtest-net. This command checks ping, ssh, etc., and if, for instance, a ping times out, it immediately sets the status to red. One minute later, the bbretest command kicks in, rechecks all the newly red hosts, and finds them green again. This happens all the time, and it clutters up my log.
Is there any way to make Hobbit report a red status only after bbretest has run once and confirmed the failure?
Upvotes: 1
Views: 519
Reputation:
You can use:
<ip> <hostname> # noconn
Put this in bb-hosts for a server that doesn't respond to ping, then check that it is alive through a service test instead.
Upvotes: 0
Reputation: 63538
I think your best bet is to skip the stock Hobbit service tests and write your own. It's not difficult.
It is a good idea to have your test script go red only after several successive attempts have failed.
You can disable the standard Hobbit ones and use your own instead. Having said that, the default behaviour of the "conn" test seems fairly reasonable (going red immediately if the server doesn't ping).
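As a rough illustration of the "several successive attempts" idea, here is a minimal Python sketch. It is not part of Hobbit: the function name `color_after_retries`, the `check` callable, and the default of three attempts are all assumptions made for the example. The probe itself (ping, HTTP request, and so on) is left to whatever your custom test already does.

```python
import time
from typing import Callable


def color_after_retries(check: Callable[[], bool],
                        attempts: int = 3,
                        delay_s: float = 0.0) -> str:
    """Return "green" as soon as one attempt succeeds.

    Return "red" only if every one of `attempts` successive probes fails.
    `check` is a hypothetical probe supplied by the caller; it returns
    True when the host or service answered.
    """
    for i in range(attempts):
        if check():
            return "green"
        # Wait between attempts, but not after the last one.
        if i < attempts - 1:
            time.sleep(delay_s)
    return "red"
```

A custom test would call this with its own probe and report the resulting color to the Hobbit server, so a single dropped ping never turns the column red.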
Unfortunately, as far as I know there's no option in the Hobbit alerting system to alert only if a problem persists for X minutes, which would be really useful. I'm sure you could do that with a custom alerting script, though.
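A custom alerting script along those lines could keep track of when each test first went red and stay quiet until the problem has lasted long enough. The sketch below shows only that bookkeeping; the class name `PersistenceGate`, the `host.test` key format, and the way timestamps are passed in are assumptions for illustration, not Hobbit features.

```python
from typing import Dict


class PersistenceGate:
    """Allow an alert only after a test has stayed red for a minimum time."""

    def __init__(self, min_red_seconds: float) -> None:
        self.min_red_seconds = min_red_seconds
        # Maps a hypothetical "host.test" key to the time it first went red.
        self._red_since: Dict[str, float] = {}

    def should_alert(self, key: str, color: str, now: float) -> bool:
        if color != "red":
            # Any non-red report clears the record, so the clock restarts.
            self._red_since.pop(key, None)
            return False
        first_red = self._red_since.setdefault(key, now)
        return now - first_red >= self.min_red_seconds
```

For example, with `PersistenceGate(120)` a test that is red at time 0 and again at time 60 produces no alert, while a third red report at time 120 does. In a real script the state would have to be persisted between runs, for instance in a small file.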
Upvotes: 0
Reputation: 391346
First, this is a programming site so you won't get many answers.
But.... but ...
If your server times out, isn't that a problem?
Sounds to me like Hobbit is doing the job it was designed for: telling you that something needs your attention.
Fix the timeout problem, and your log should be fine.
Upvotes: 2