David W

Reputation: 565

Nagios Monitoring Hosts with check_ping

I've deployed a new instance of Nagios on a fresh install of CentOS 7 via the EPEL repository, so the Nagios Core version is 3.5.1.

After installing nagios and nagios-plugins-all (via yum), I've created a number of hosts and service definitions, have tested my configuration with nagios -v /etc/nagios/nagios.cfg, and have Nagios up and running!

Unfortunately, my host checks are failing (although my service checks are working perfectly fine).

Within the Nagios Web GUI / Dashboard, if I drill down into a Host page with the "Host State Information", I see this being reported for "Status Information" (IP address removed):

Status Information: /usr/bin/ping -n -U -w 30 -c 5 {my-host-ip-address}

CRITICAL - Could not interpret output from ping command


So in my troubleshooting, I drilled down into the Nagios Plugins directory (/usr/lib64/nagios/plugins), and ran a test with the check_ping plugin consistent with the way check-host-alive runs the command (see below for my check-host-alive command definition):

./check_ping -H {my-ip-address} -w 3000.0,80% -c 5000.0,100% -p 5

This check_ping command returns the following output:

PING OK - Packet loss = 0%, RTA = 0.63 ms|rta=0.627000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

I haven't changed the definition of how check_ping works, and can confirm that I'm getting a "PING OK" whenever the command is run the same way that check-host-alive runs the command, so I cannot figure out what's going on!
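Since Nagios decides the state from the plugin's exit code and not only from the text it prints, it can help to look at that code directly when testing by hand. Below is a minimal sketch, not part of the original setup: the function name `state_name` and the script name in the usage comment are hypothetical, and the mapping assumes the standard plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name.
# 0..3 are the standard plugin return codes; anything else is
# treated as UNKNOWN in this sketch.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# If a plugin command line is passed as arguments, run it and report
# the exit code alongside the plugin's own output.
# Hypothetical usage:
#   ./plugin_state.sh ./check_ping -H {my-ip-address} -w 3000.0,80% -c 5000.0,100% -p 5
if [ "$#" -gt 0 ]; then
    "$@"
    rc=$?
    echo "exit code $rc => $(state_name "$rc")"
fi
```

Running it once from a root shell and once as the account the Nagios daemon uses makes any difference between the two environments visible.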

Below are the command definitions for check-host-alive as well as check_ping.

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }

{snip}

# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

Any suggestions on how I can fix my check-host-alive command definition to work properly and evaluate the output of check_ping properly?

Edit

Below is the full define host {} template I'm using:

define host     {
        host_name                       myers    ; The name of this host template
        alias                           Myers
        address                         [redacted]
        check_command                   check-host-alive
        contact_groups                  admins
        notifications_enabled           0               ; Host notifications are disabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        1
        max_check_attempts              2
        }

Upvotes: 2

Views: 61602

Answers (4)

David W

Reputation: 565

I was fairly certain that running chmod u+s /usr/bin/ping would solve the issue, but I was (and still am) wary about chmod'ing system files. It seems to me that there has to be a safer way to do it.

However, in the end, that's what I did - and it works. I don't like it, from a security standpoint.
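For anyone who wants to see the current state before or after changing it, here is a rough sketch that only reports whether the setuid bit is present on the ping binary. The helper name `has_setuid` and the `PING_BIN` variable are made up for this example; the default path is the one from the question and may differ on other systems.

```shell
#!/bin/sh
# Return success if the given file has the setuid bit set
# (the bit that `chmod u+s` turns on).
has_setuid() {
    [ -u "$1" ]
}

# PING_BIN is an assumption: the question uses /usr/bin/ping,
# other systems may have it at /bin/ping.
PING_BIN="${PING_BIN:-/usr/bin/ping}"

if has_setuid "$PING_BIN"; then
    echo "$PING_BIN: setuid bit is set"
else
    echo "$PING_BIN: setuid bit is not set"
fi
```

To reproduce what Nagios itself sees, run the plugin as the daemon's account, for example `sudo -u nagios ./check_ping -H {my-ip-address} -w 3000.0,80% -c 5000.0,100% -p 5` (assuming the daemon runs as user `nagios`). Some distributions grant ping its raw-socket access through file capabilities instead of setuid; if `getcap` is installed, `getcap /usr/bin/ping` shows those.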

Upvotes: 2

Hasitha

Reputation: 795

I had the same problem, and the answers above did not work for me. After looking into it further, I found that the cause was the IP protocol version. Once I passed the correct IP protocol flag, it worked fine.

/usr/local/nagios/libexec/check_ping -H localhost -w 3000.0,80% -c 5000.0,100% -4

output

PING OK - Packet loss = 0%, RTA = 0.05 ms|rta=0.051000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

By default it was using IPv6:

/usr/local/nagios/libexec/check_ping -H localhost -w 3000.0,80% -c 5000.0,100% -6

output

/sbin/ping6 -n -U -W 30 -c 5 localhost
CRITICAL - Could not interpret output from ping command

However, when integrating with the Nagios server, I was not able to pass this flag as an argument, so I used the following workaround in the client-side nrpe.cfg file:

command[check_ping_args]=/usr/local/nagios/libexec/check_ping -H $ARG1$ -w $ARG2$ -c $ARG3$ -4

Here the host and the warning and critical thresholds are passed from the Nagios server as follows:

define service{
    use                             generic-service
    hostgroup_name                  all-servers
    service_description             Host Ping Status
    check_command                   check_nrpe_args!check_ping_args!localhost!3000.0,80%!5000.0,100%
}

Upvotes: 0

Jeffrey Tackett

Reputation: 127

For anyone else who runs into this issue, there's another option besides changing permissions on ping: change the host check command to use check_host rather than check_ping. While there are certainly some differences in functionality, the overall end result is the same.

Some will say this isn't a good option because check_ping lets you set warning and critical ranges, but it should be remembered that host checks aren't even executed until all service checks for a given host have failed. Anyway, if you're interested in testing throughput, there are MUCH better ways of going about it than relying on ICMP, which is the lowest priority traffic type on a network.

I'm sure the OP is well on to other things by now, but hopefully someone else who has this issue will benefit.

Upvotes: 5

Lakshmikandan

Reputation: 4647

I could not find ping at /usr/bin/ping; on my system it is at /bin/ping, so I set the setuid bit there:

# chmod u+s /bin/ping 

# ls -al /bin/ping 
-rwsr-xr-x 1 root root 40760 Sep 26  2013 /bin/ping*

Finally, run the command below:

 /usr/local/nagios/libexec/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60% -p 5

Upvotes: 3
