Reputation: 33
I've developed a small Nagios monitoring script that runs tcpdump on a given interface and port and looks for a particular string in the first 10 captured packets. I'm monitoring a system that may hang and flood my server with a particular message.
I'm not a professional Perl programmer, but I believe I've handled all the exceptions I could.
Running this script locally finishes fine and returns the console to me. However, when I run it from my Nagios server over ssh (ssh user@host -i private_key '/path/script.pl'), the script executes successfully and I get the exit message, but ssh does not exit. I have to either press Ctrl+C or hit Return a few times to get bash back. Running it with check_by_ssh yields a plugin timeout error, for obvious reasons.
I'm pretty sure it has something to do with the fork() I'm using, but I don't know what is wrong with it.
#!/usr/bin/perl -w
use strict;
use warnings;
use Getopt::Long;
my $RC_OK = 0;
my $RC_WARNING = 1;
my $RC_CRITICAL = 2;
my $RC_UNKNOWN = 3;
my $GREP_RC = undef;
my $PORT = undef;
my $INT = undef;
my $STRING = undef;
my $PID = undef;
# Main timeout alarm handler
$SIG{ALRM} = sub {
    print "UNKNOWN: Main script timed out!\n";
    exit $RC_UNKNOWN;
};
# Start the global timer
alarm(8);
# Collect parameters
GetOptions ("port=s" => \$PORT,
            "interface=s" => \$INT,
            "string=s" => \$STRING);
# Sanity check of parameters
if((not defined $PORT) || (not defined $STRING)) {
    print "Usage: ./check_stratus.pl -p=PORT -i=INTERFACE -s=STRING\n";
    exit $RC_UNKNOWN;
}
# Capture with tcpdump
defined($PID = fork()) or die "Failed to fork: $!\n";
if ($PID == 0) {
    # Secondary timeout alarm handler
    $SIG{ALRM} = sub {
        exit 1;
    };
    # Capture for at most 5 seconds, or 10 packets
    alarm(5);
    `sudo /usr/sbin/tcpdump -nX -s 2048 -c 10 -i $INT port $PORT > /tmp/capture.txt 2>&1`;
    # Check whether tcpdump ran successfully
    if ($? != 0) {
        print "Error running \"/usr/sbin/tcpdump -nX -s 2048 -c 1 -i $INT port $PORT > /tmp/capture.txt\", check the output file for more details.\n";
        exit $RC_UNKNOWN;
    }
    exit $RC_OK;
}
# Wait for the child to finish...
waitpid($PID, 0);
# Check that the capture file is OK
`/bin/ls /tmp/capture.txt`;
if ($? != 0) {
    print "Could not find the file /tmp/capture.txt\n";
    exit $RC_UNKNOWN;
}
# Run grep for the string on the capture
`/bin/grep $STRING /tmp/capture.txt`;
# Check the grep result
if ($? == 0) {
    print "The string \"$STRING\" was found in the tcpdump capture listening on interface $INT and port $PORT!\n";
    exit $RC_CRITICAL;
}
if ($? == 256) {
    print "The string \"$STRING\" was not found in the tcpdump capture listening on interface $INT and port $PORT.\n";
    exit $RC_OK;
} else {
    print "Unknown error! grep exit code was $?\n";
    exit $RC_UNKNOWN;
}
Any help is deeply appreciated.
Thank you!
Upvotes: 3
Views: 1062
Reputation: 8059
Look at this minimal test case:
#!/usr/bin/perl
use strict;
my $PID;
defined($PID = fork()) or die "no fork works";
if ($PID == 0) {
    # Secondary timeout alarm handler
    $SIG{ALRM} = sub {
        exit 1;
    };
    # Time out after 1 second
    alarm(1);
    `sleep 100`;
}
waitpid($PID, 0);
host:/tmp$ ps xawww |grep sleep
1705 pts/2 S+ 0:00 grep sleep
host:/tmp$ time /tmp/test.pl
real 0m1.008s
user 0m0.000s
sys 0m0.004s
host:/tmp$ ps xawww |grep sleep
1708 pts/2 S 0:00 sleep 100
1710 pts/2 S+ 0:00 grep sleep
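The problem is that backticks fork a new process to run the command, and that process does not receive the signal sent to your Perl child. When the alarm fires, the Perl child exits, but the command keeps running, as the leftover sleep 100 above shows. A leftover process like that can keep the ssh session's file descriptors open, which would explain why ssh does not return.
The solution is to use exec() instead of `` or system(), because exec() does not fork a new process. The pending alarm carries over to the executed command: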
alarm(1);
exec("sleep 100");
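For a fuller picture, here is a rough sketch of how the parent can tell that the command was stopped by the alarm. The helper name run_with_timeout is made up for illustration and is not part of the original script. It assumes a Unix-like system with a sleep binary, and it uses the list form of exec so that no shell is involved:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper: run a command in a child with a hard time limit.
# Returns (exit_code, signal_number) decoded from $?.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Set the alarm before exec(): exec() replaces this process but
        # keeps the same PID, so the pending alarm applies to the command.
        alarm($timeout);
        exec { $cmd[0] } @cmd or do {
            print STDERR "exec failed: $!\n";
            POSIX::_exit(127);
        };
    }

    # Reap the child and split $? into exit code and terminating signal.
    waitpid($pid, 0);
    return ($? >> 8, $? & 127);
}

my ($code, $signal) = run_with_timeout(1, 'sleep', '100');
print "exit code: $code, signal: $signal\n";
print "command was stopped by the alarm\n" if $signal == SIGALRM;
```

No Perl-level ALRM handler survives the exec, so the command is terminated by the default SIGALRM action and the parent sees the signal number in the low bits of $?. Note that the original tcpdump line relies on shell redirection and sudo, so it may behave differently from this simple case.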
Upvotes: 2