Reputation: 2283
I have two hosts for which the host checks are no longer working (because the pings are stopped by a firewall), causing Nagios to send notifications about them and list them as DOWN and coloured red. I want to temporarily disable the host checks for these hosts (but not remove the hosts, or disable the checks of the services on them, since those work fine). What is the best way to do this?
I have tried changing their definitions to use generic-host instead of use linux-server. Those templates are defined as follows:
define host{
    name                            linux-server    ; The name of this host template
    use                             generic-host    ; This template inherits other values from the generic-host template
    check_period                    24x7            ; By default, Linux hosts are checked round the clock
    check_interval                  5               ; Actively check the host every 5 minutes
    retry_interval                  1               ; Schedule host check retries at 1 minute intervals
    max_check_attempts              10              ; Check each Linux host 10 times (max)
    check_command                   check-host-alive ; Default command to check Linux hosts
    notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                    ; Note that the notification_period variable is being overridden from
                                                    ; the value that is inherited from the generic-host template!
    notification_interval           120             ; Resend notifications every 2 hours
    notification_options            d,u,r           ; Only send notifications for specific host states
    contact_groups                  admins          ; Notifications get sent to the admins by default
    register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
define host{
    name                            generic-host    ; The name of this host template
    notifications_enabled           1               ; Host notifications are enabled
    event_handler_enabled           1               ; Host event handler is enabled
    flap_detection_enabled          1               ; Flap detection is enabled
    process_perf_data               1               ; Process performance data
    retain_status_information       1               ; Retain status information across program restarts
    retain_nonstatus_information    1               ; Retain non-status information across program restarts
    notification_period             24x7            ; Send host notifications at any time
    register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
    max_check_attempts              1
}
I had expected the host checks to stop, since generic-host has no check_command configured, but instead they continue (I have no idea what command Nagios is actually running) and the hosts stay in status DOWN.
I have also tried adding an empty check_command line to the host definitions, to override the check_command parameter to be blank, which the Nagios docs say should disable host checks, but then Nagios does not accept the configuration, saying that there "is no command named ''".
What I want is for Nagios to stop doing host checks for these hosts, and for the status to go back to OK/UP. What is the proper way to achieve that?
Upvotes: 0
Views: 9387
Reputation: 53
Have you tried clearing retention.dat? This would wipe out all current host and service status, all comments, downtime, etc. Essentially start with a clean state.
> cd /usr/local/nagios/var
> (optional) cp retention.dat retention.dat.backup
> rm retention.dat
> service nagios restart
Edit: This should go in tandem with the other solutions here: disable host checks first, THEN do a reset by clearing the retention data. This may or may not get the host states where you want them, but they will no longer be DOWN or throwing notifications.
Upvotes: 0
Reputation: 252
You can force a host check to always return OK by using the check_dummy check command.
Place the following command definition in, for example, your commands.cfg file:
# 'check_dummy' command definition
# NOTE: This command always returns an 'OK' result no matter what.
define command{
    command_name    check_dummy
    command_line    $USER1$/check_dummy 0
}
Then in your host definition add the following line:
check_command check_dummy
Restart the nagios service and your non-pingable host will now always be 'UP'.
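For context, a host definition using this command might look roughly like the sketch below. The host name and address are hypothetical placeholders (they are not from the question); the linux-server template is the one quoted in the question, and the check_command line here overrides the check-host-alive value inherited from it.

```
define host{
    use             linux-server        ; Template from the question
    host_name       firewalled-host     ; Hypothetical name, replace with your own
    address         192.0.2.10          ; Hypothetical address (documentation range)
    check_command   check_dummy         ; Overrides check-host-alive from the template
}
```

Running the Nagios configuration verification before restarting should confirm that the check_dummy command is defined and resolved correctly.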
Upvotes: 1
Reputation: 1328
You have several options, and you do not even need to edit the config files.
1. Disable notifications for this host. Nagios will still check this host, but notifications are no longer generated. Notifications must be manually re-enabled after you fix the firewall problem.
2. Acknowledge this host problem, in the same place where you would disable notifications (the web UI). This lets you suppress notifications and also attach a comment/note to the problem. Notifications are automatically re-enabled when the host changes its status to UP (green).
3. Disable active checks of this host, along with Disable notifications for this host. This disables notifications and also stops Nagios from pinging the remote host. Do not forget to re-enable these options after you fix your firewall issue/rules.
Upvotes: 3
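The web UI options in the answer above can also be driven from a shell through Nagios external commands, if external commands are enabled on your installation. The following is a minimal sketch under some assumptions: the command file path is the usual default for a source install and may differ on your system, the helper name nagios_cmd is made up for this example, and firewalled-host is a hypothetical host name.

```shell
#!/bin/sh
# Assumed default location of the external command file; adjust to match
# the command_file setting in your nagios.cfg.
NAGIOS_CMD_FILE="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Hypothetical helper: format one external command line as
#   [epoch_seconds] COMMAND;host_name
# It only prints the line; the caller decides where to send it.
nagios_cmd() {
    printf '[%s] %s;%s\n' "$(date +%s)" "$1" "$2"
}

# Example usage (commented out so that sourcing this file has no side effects):
# nagios_cmd DISABLE_HOST_CHECK firewalled-host >> "$NAGIOS_CMD_FILE"
# nagios_cmd DISABLE_HOST_NOTIFICATIONS firewalled-host >> "$NAGIOS_CMD_FILE"
```

Once the firewall is fixed, the same helper can send ENABLE_HOST_CHECK and ENABLE_HOST_NOTIFICATIONS for the host to undo the change.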