Reputation: 155
#!/bin/bash
date=$(date +"%Y-%m-%d %H:%M:%S")
getpid=$(pgrep nginx | wc -l)
if [ "$getpid" > 0 ]
then
echo 'Nginx is Fine, It is Running at' $date
else
echo "Error on Nginx and stopped at" $date
sudo fuser -k 443/tcp
sudo service nginx start
sudo service monit start
sudo monit monitor all
echo "Error on Nginx and stopped at" $date | mailx -s "The Nginx Stop - But it is Fixed" -A /root/nginx_log3.txt mymail.com
fi
exit 0;
I see several ways to check the service:
$(ps ax | grep myName | fgrep -v grep | awk '{ print $1 }')
or
if (( $(ps -ef | grep -v grep | grep $service | wc -l) > 0 ))
and several others. There are innumerable ways of checking whether a process is running. The problem is that even after I terminate nginx, any command using ps still returns many results with PID values. The pgrep count, on the other hand, always shows 0 after I run service nginx stop, so pgrep seems perfect for comparing against 0: if the service is running, the value is always above 0. pgrep looks like the way to go, but I have a crontab that runs this script every 3 minutes, and even with the process running, with a value greater than zero, it restarts the service!
I found several blogs with scripts for this, but none of them works. I also have monit to keep the services alive, but sometimes it fails.
Clearly I am not sure how to compare the extracted value with 0:
getpid=$(pgrep nginx | wc -l)
if [ "$getpid" > 0 ]
because the service restarts every 3 minutes even when it is running with a PID (a value greater than 0).
I really appreciate your help!
Upvotes: 0
Views: 3404
Reputation: 7327
I would check the web services you are trying to serve rather than whether the process is running. Take a look at these examples:
# nc -z prints nothing on success, so record the result from its exit status
https=$(nc -z localhost 443 && echo ok)
http=$(nc -z localhost 80 && echo ok)
netstt_cnt=$(netstat -ntlp | grep httpd | wc -l)
http_issues=""
if [[ ! $https ]] || [[ ! $http ]] ;then
http_issues=" -Http/https ports not detected "
fi
if [ $netstt_cnt -ne 2 ] ;then
http_issues="${http_issues} -Netstat not reporting httpd "
fi
# -- if http_host_check is set perform httpd checks
code_stat=""   # use "local" only if this code is placed inside a function
if [[ $http_host_check ]] ;then
http_code=$(curl --write-out %{http_code} --silent --output /dev/null $http_host_check)
if [ $http_code -lt 1 ] ;then
http_issues="${http_issues} -Apache NOT serving pages http_code=$http_code. "
elif [ $http_code -gt 399 ] ;then
http_issues="${http_issues} -Apache Error http_code=$http_code on test page ${http_host_check}. "
fi
code_stat=", (http_code=${http_code}) "
fi
# -- php FPM sock, see readonly var $PHP_FPM_SOCK for use set to "" to disable this check.
if [[ $PHP_FPM_SOCK ]] ;then
if ! echo /dev/null | socat UNIX:${PHP_FPM_SOCK} - ;then
http_issues="${http_issues} -php-fpm sock not communicating"
fi
fi
if [[ $http_issues ]] ;then
echo "Error on Nginx and stopped at" $date
sudo service nginx stop
sudo service monit stop
sudo fuser -k 443/tcp
sleep 10
sudo service nginx start
sudo service monit start
sudo monit monitor all
if [[ $http_host_check ]] ;then
http_code=$(curl --write-out %{http_code} --silent --output /dev/null $http_host_check)
if [ $http_code -lt 1 ] ;then
http_issues="${http_issues} -Apache NOT serving pages http_code=$http_code. "
elif [ $http_code -gt 399 ] ;then
http_issues="${http_issues} -Apache Error http_code=$http_code on test page ${http_host_check}. "
fi
sleep 5
code_stat=", (http_code=${http_code}) "
echo "Webserver had a problem, current status is $code_stat" $date | mailx -s "The Nginx stop: $code_stat" -A /root/nginx_log3.txt mymail.com
fi
fi
echo "Current Status : $http_issues"
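The status-code handling above can be pulled into a small function, which makes it possible to try the logic without a live server. This is only a sketch: classify_http_code is a hypothetical helper name, not part of curl or nginx, and the thresholds simply mirror the ones used above (below 1 means no response, above 399 means an error).

```shell
# Hypothetical helper: map a curl %{http_code} value to a short status word.
classify_http_code() {
    case $1 in
        # empty or non-numeric input: treat as no usable response
        ''|*[!0-9]*) echo "down"; return ;;
    esac
    if [ "$1" -lt 1 ]; then
        echo "down"      # curl prints 000 when it got no HTTP response
    elif [ "$1" -gt 399 ]; then
        echo "error"     # 4xx and 5xx responses
    else
        echo "ok"
    fi
}
```

It could then be fed the curl result from above, for example classify_http_code "$http_code", and the restart steps run only when the answer is not "ok".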
Update: super simple example added here:
http_code=$(curl --write-out %{http_code} --silent --output /dev/null http://my_domain.com/)
if [ $http_code -lt 1 ] ;then
echo "${http_issues} -Apache NOT serving pages http_code=$http_code. "
# ... do something here (restart web server) ...
fi
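As for the comparison in the question itself: inside [ ] the > is not a numeric operator. The shell treats > 0 as a redirection to a file named 0, and what is left, [ "$getpid" ], only tests that the string is non-empty. Numeric comparison uses -gt. A minimal sketch, where is_running is a hypothetical helper that takes the count as an argument so it can be tried without touching nginx:

```shell
# Hypothetical helper: succeed when the process count is greater than zero.
is_running() {
    [ "$1" -gt 0 ]    # -gt compares integers; '>' here would be a redirection
}

count=0               # stand-in for: count=$(pgrep nginx | wc -l)
if is_running "$count"; then
    echo "running"
else
    echo "stopped"
fi
```

In the script from the question that would be if [ "$getpid" -gt 0 ]. Since pgrep exits with a non-zero status when nothing matches, if pgrep nginx >/dev/null is another option that avoids the count entirely.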
Upvotes: 2