user3373692

Reputation: 51

Simulate Nagios notifications

My normal method of testing the notification and escalation chain is to simulate a failure by causing one, for example blocking a port.

But this is thoroughly unsatisfying. I don't want downtime recorded in Nagios where there was none. I also don't want to wait.

Does anyone know a way to test a notification chain without causing an outage? For example, something like this:

$ ./check_notifications_chain <service|host> <time down>
at <x> minutes notification email sent to group <people>
at <2x> minutes notification email sent to group <people>
at <3x> minutes escalated to group <management>
at <200x> rm -rf; shutdown -h now executed.

Extending this paradigm, I might make the notification chain a Nagios check in itself, but I'll stop here before my brain explodes.

Anyone?

Upvotes: 5

Views: 15764

Answers (2)

Anup

Reputation: 31

This is an old post, but maybe my solution can help someone.

I use the "check_dummy" plugin, which ships with the Nagios plugins pack. As the name suggests, it does nothing clever: it simply returns the state you pass to it.

Here are some examples of how it works:

Usage:
 check_dummy <integer state> [optional text]
$ ./check_dummy 0
OK
$ ./check_dummy 2
CRITICAL
$ ./check_dummy 3 salut
UNKNOWN: salut
$ ./check_dummy 1 azerty
WARNING: azerty
$ echo $?
1
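
To make the convention behind those outputs explicit, here is a minimal sketch of a shell function that behaves like the examples above. It is not the real plugin source; `dummy_check` is a hypothetical name, and the message printed for an invalid state is my own assumption.

```shell
#!/bin/sh
# Hypothetical stand-in for check_dummy (NOT the actual plugin source).
# Prints the label for a Nagios state number and returns that number,
# mirroring the state/exit-code convention shown in the examples above.
dummy_check() {
    state=${1:-}
    [ $# -gt 0 ] && shift
    case "$state" in
        0) label=OK ;;
        1) label=WARNING ;;
        2) label=CRITICAL ;;
        3) label=UNKNOWN ;;
        # Assumed behaviour for bad input: report UNKNOWN.
        *) echo "UNKNOWN: state must be 0, 1, 2 or 3"; return 3 ;;
    esac
    if [ $# -gt 0 ]; then
        echo "$label: $*"
    else
        echo "$label"
    fi
    return "$state"
}
```

Nagios only looks at the exit code to decide the state; the printed text is what ends up in the notification.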

I create a file that contains the integer state and the optional text:

echo 0 OKAY | sudo tee /usr/local/nagios/libexec/dummy.txt
sudo chown nagios:nagios /usr/local/nagios/libexec/dummy.txt

Then I define this command:

# Dummy check (notifications tests)
define command {
    command_name    my_check_dummy
    command_line    $USER1$/check_dummy $(cat /usr/local/nagios/libexec/dummy.txt)
}

and associate it with this service definition:

define service {
    use                             generic-service
    host_name                       localhost
    service_description             Dummy check
    check_period                    24x7
    check_interval                  1
    max_check_attempts              1
    retry_interval                  1
    notifications_enabled           1
    notification_options            w,u,c,r
    notification_interval           0
    notification_period             24x7
    check_command                   my_check_dummy
}

Now I only have to change the contents of "dummy.txt" to change the service state:

echo "2 Oups" | sudo tee /usr/local/nagios/libexec/dummy.txt
echo "1 AHHHH" | sudo tee /usr/local/nagios/libexec/dummy.txt
echo "0 Parfait !" | sudo tee /usr/local/nagios/libexec/dummy.txt

This allowed me to debug my notification program.

Hope it helps!
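
If you switch states often, a small wrapper can save typing and catch typos. This is only a sketch: `set_dummy_state` and the `DUMMY_FILE` variable are names I made up, and the default path is simply the one used above.

```shell
#!/bin/sh
# Hypothetical helper: write a validated "state text" line for the dummy check.
# DUMMY_FILE defaults to the path used above; override it to try this safely.
DUMMY_FILE=${DUMMY_FILE:-/usr/local/nagios/libexec/dummy.txt}

set_dummy_state() {
    state=${1:-}
    [ $# -gt 0 ] && shift
    case "$state" in
        0|1|2|3) ;;
        *) echo "usage: set_dummy_state <0|1|2|3> [text]" >&2; return 64 ;;
    esac
    # One line, state first, which is what $(cat ...) hands to check_dummy.
    if [ $# -gt 0 ]; then
        printf '%s %s\n' "$state" "$*" > "$DUMMY_FILE"
    else
        printf '%s\n' "$state" > "$DUMMY_FILE"
    fi
}
```

For example, `set_dummy_state 2 Oups` followed a few minutes later by `set_dummy_state 0 Parfait` walks the service through CRITICAL and back to OK. It has to run as a user who can write to the file.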

Upvotes: 3

Tyler Henthorn

Reputation: 581

If you only want to verify that the email alerts are working properly, you could create a simple test service, which generates a warning once a day.

test_alert.sh:

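#!/bin/bash

# Current UTC time as HHMM, e.g. 0830 or 1915.
now=$(date -u +%H%M)

echo "$now"
echo "Nagios test script. Intentionally generates a warning daily."

# 10# forces base 10; otherwise bash reads values such as 0830 as invalid octal.
if (( 10#$now >= 1900 && 10#$now <= 1920 )); then
  exit 1   # WARNING between 19:00 and 19:20 UTC
else
  exit 0   # OK for the rest of the day
fi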
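
If you don't want to wait until 19:00 UTC to see what the script does, the window test can be pulled into a function and the time made injectable. This is a rough sketch, not part of the original script: `in_alert_window`, `test_alert` and the `TEST_TIME` override are all hypothetical names.

```shell
#!/bin/sh
# Sketch: same 19:00-19:20 UTC window, with the time injectable for dry runs.
# TEST_TIME (HHMM) is a hypothetical override; leave it unset for normal use.
in_alert_window() {
    # [ ] compares plain decimal integers, so 0830 is not read as octal.
    [ "$1" -ge 1900 ] && [ "$1" -le 1920 ]
}

test_alert() {
    now=${TEST_TIME:-$(date -u +%H%M)}
    echo "Nagios test script at $now UTC. Intentionally generates a warning daily."
    if in_alert_window "$now"; then
        return 1   # WARNING
    fi
    return 0       # OK
}
```

To use it as a plugin, end the file with `test_alert; exit $?`. Running it by hand with `TEST_TIME=1910` set shows the warning path without touching Nagios.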

commands.cfg:

define command{
  command_name  test_alert
  command_line  /bin/bash /usr/local/scripts/test_alert.sh
}

services.cfg:

define service {
  host_name             localhost
  service_description   Test Alert
  check_command         test_alert
  use                   generic-service
}

Upvotes: 6
