fulv

Reputation: 1016

Nagios: CRITICAL - Socket timeout after 10 seconds

I've been running nagios for about two years, but recently this problem started appearing with one of my services.

I'm getting

CRITICAL - Socket timeout after 10 seconds

for a check_http -H my.host.com -f follow -u /abc/def check, which used to work fine. No other services are reporting this problem. The remote site is up and healthy: I can run wget http://my.host.com/abc/def from the nagios server, and it downloads the response just fine. Running check_http -H my.host.com -f follow also works, i.e. things only break when I use the -u argument. Passing a different user agent string made no difference, and neither did increasing the timeout. I tried with -v, but all I get is:

GET /abc/def HTTP/1.0
User-Agent: check_http/v1861 (nagios-plugins 1.4.11)
Connection: close
Host: my.host.com


CRITICAL - Socket timeout after 10 seconds

... which does not tell me what's going wrong.

Any ideas how I could resolve this?

Thanks!

Upvotes: 7

Views: 63401

Answers (5)

Duven Duven

Reputation: 1

In my case the /etc/postfix/main.cf file was not configured properly. My mail relay server was not defined, and the configuration was also very restrictive. I had to add:

relayhost = mailrelay.ext.example.com

smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination

Upvotes: 0

ElementalStorm

Reputation: 828

For anyone interested: I ran into this problem too, and the cause turned out to be mod_itk on the web server.

A patch is available, although it does not seem to be included in the current CentOS or Debian packages:

https://lists.err.no/pipermail/mpm-itk/2015-September/000925.html

Upvotes: 0

Fixed with this URL in nrpe.cfg (on Debian 6.0 Squeeze, using nagios-nrpe-server):

command[check_http]=/usr/lib/nagios/plugins/check_http -H localhost -p 8080 -N -u /login?from=%2F

Upvotes: 0

sweetfa

Reputation: 5845

I tracked my issue down to the security providers configured in the most recent version of OpenSUSE.

Judging from other web pages, it appears to be caused by an attempt to use the TLSv2 protocol, which either does not work correctly or is missing something in the default configuration that it needs in order to work.

To work around the problem, I commented out the security provider in question in the JRE security configuration file:

#security.provider.10=sun.security.pkcs11.SunPKCS11

The number in the security.provider. key may be different in your configuration, but essentially the SunPKCS11 provider is at issue.

This configuration is normally found in

$JAVA_HOME/lib/security/java.security

of the JRE that you are using.

Upvotes: 1

rwf

Reputation: 186

Try using the -N option of check_http.

I ran into similar problems, and in my case the web server didn't terminate the connection after sending the response (https was working, http wasn't). check_http tries to read from the open socket until the server closes the connection. If that doesn't happen then the timeout occurs.

The -N option tells check_http to receive only the header, but not the content of the page / document.
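To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is not check_http's actual code: start_lingering_server and fetch are hypothetical helpers, and the local server is a stand-in for a web server that sends its response but never closes the connection. Reading until the server closes runs into the timeout, while stopping after the headers (the idea behind -N) returns right away.

```python
import socket
import threading


def start_lingering_server():
    """Hypothetical stand-in for a web server that replies but never closes."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(5)
    held = []  # keep accepted sockets referenced so they stay open

    def serve():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return
            held.append(conn)
            conn.recv(4096)  # read (and ignore) the request
            conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
            # Deliberately no conn.close(): this is the misbehaviour.

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]


def fetch(port, headers_only, timeout=1.0):
    """Send a GET and read the reply, either fully or headers only."""
    request = (b"GET /abc/def HTTP/1.0\r\n"
               b"Host: localhost\r\n"
               b"Connection: close\r\n\r\n")
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        try:
            while True:
                # Headers-only mode: stop at the blank line ending the headers.
                if headers_only and b"\r\n\r\n" in data:
                    return "OK"
                chunk = sock.recv(4096)
                if not chunk:  # server closed the connection
                    return "OK"
                data += chunk
        except socket.timeout:
            return "CRITICAL - Socket timeout"


if __name__ == "__main__":
    port = start_lingering_server()
    print("headers only:   ", fetch(port, headers_only=True))
    print("read until close:", fetch(port, headers_only=False))
```

Against a server that does close the connection after responding, both modes return OK, which matches the observation that only some URLs or vhosts trigger the timeout.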

Upvotes: 17
