Reputation: 685
I have a process that fails regularly & sometimes starts duplicate instances..
When I run:
ps x |grep -v grep |grep -c "processname"
I will get:
2
This is normal as the process runs with a recovery process..
If I get
0
I will want to start the process
if I get:
4
I will want to stop & restart the process
What I need is a way of taking the result of ps x |grep -v grep |grep -c "processname"
Then setup a simple 3 option function
ps x |grep -v grep |grep -c "processname"
if answer = 0 (start process & write NOK & Time to log /var/processlog/check)
if answer = 2 (Do nothing & write OK & time to log /var/processlog/check)
if answer = 4 (stot & restart the process & write NOK & Time to log /var/processlog/check)
The process is stopped with
killall -9 process
The process is started with
process -b -c /usr/local/etc
My main problem is finding a way to act on the result of ps x |grep -v grep |grep -c "processname"
.
Ideally, I would like to make the result of that grep a variable within the script with something like this:
process=$(ps x |grep -v grep |grep -c "processname")
If possible.
Upvotes: 43
Views: 136175
Reputation: 41456
Programs to monitor if a process on a system is running.
Script is stored in crontab
and runs once every minute.
#! /bin/bash
case "$(pidof amadeus.x86 | wc -l)" in
0) echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
/etc/amadeus/amadeus.x86 &
;;
1) # all ok
;;
*) echo "Removed double Amadeus: $(date)" >> /var/log/amadeus.txt
kill $(pidof amadeus.x86 | awk '{print $1}')
;;
esac
0
If process is not found, restart it.
1
If process is found, all ok.
*
If process running 2 or more, kill the last.
It just tests the exit flag $?
from the pidof
program. It will be 0
of process is running and 1
if not.
#!/bin/bash
pidof amadeus.x86 >/dev/null
if [[ $? -ne 0 ]] ; then
echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
/etc/amadeus/amadeus.x86 &
fi
pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &
This can then be used in crontab to run every minute like this:
* * * * * pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &
cccam oscam
Upvotes: 82
Reputation: 1626
If you are using CentOS, no need to write a script and set cron job. Here is one of the smartest ways to ensure systemd services restart on failure. Make following changes to /usr/lib/systemd/system/mariadb.service
Then under the [Service] section in the file, add the following 2 lines:
Restart=always
RestartSec=3
After saving the file we need to reload the daemon configurations to ensure systemd is aware of the new file
systemctl daemon-reload
Read the following link for the complete steps - https://jonarcher.info/2015/08/ensure-systemd-services-restart-on-failure/
Upvotes: 0
Reputation: 481
In case you're looking for a more modern way to check to see if a service is running (this will not work for just any old process), then systemctl might be what you're looking for.
Here's the basic command:
systemctl show --property=ActiveState your_service_here
Which will yield very simple output (one of the following two lines will appear depending on whether the service is running or not running):
ActiveState=active
ActiveState=inactive
And if you'd like to know all of the properties you can get:
systemctl show --all your_service_here
If you prefer that alphabetized:
systemctl show --all your_service_here | sort
And the full code to act on it:
service=$1
result=`systemctl show --property=ActiveState $service`
if [[ "$result" == 'ActiveState=active' ]]; then
echo "$service is running" # Do something here
else
echo "$service is not running" # Do something else here
fi
Upvotes: 0
Reputation: 79
I cannot get case to work at all. Heres what I have:
#! /bin/bash
logfile="/home/name/public_html/cgi-bin/check.log"
case "$(pidof -x script.pl | wc -w)" in
0) echo "script not running, Restarting script: $(date)" >> $logfile
# ./restart-script.sh
;;
1) echo "script Running: $(date)" >> $logfile
;;
*) echo "Removed duplicate instances of script: $(date)" >> $logfile
# kill $(pidof -x ./script.pl | awk '{ $1=""; print $0}')
;;
esac
rem the case action commands for now just to test the script. the above pidof -x command is returning '1', the case statement is returning the results for '0'.
Anyone have any idea where I'm going wrong?
Solved it by adding the following to my BIN/BASH Script: PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Upvotes: 0
Reputation: 79
The 'pidof' command will not display pids of shell/perl/python scripts. So to find the process id’s of my Perl script I had to use the -x option i.e. 'pidof -x perlscriptname'
Upvotes: 0
Reputation: 1
If you changed awk '{print $1}' to '{ $1=""; print $0}' you will get all processes except for the first as a result. It will start with the field separator (a space generally) but I don't recall killall caring. So:
#! /bin/bash
logfile="/var/oscamlog/oscam1check.log"
case "$(pidof oscam1 | wc -w)" in
0) echo "oscam1 not running, restarting oscam1: $(date)" >> $logfile
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
;;
2) echo "oscam1 running, all OK: $(date)" >> $logfile
;;
*) echo "multiple instances of oscam1 running. Stopping & restarting oscam1: $(date)" >> $logfile
kill $(pidof oscam1 | awk '{ $1=""; print $0}')
;;
esac
It is worth noting that the pidof route seems to work fine for commands that have no spaces, but you would probably want to go back to a ps-based string if you were looking for, say, a python script named myscript that showed up under ps like
root 22415 54.0 0.4 89116 79076 pts/1 S 16:40 0:00 /usr/bin/python /usr/bin/myscript
Just an FYI
Upvotes: 0
Reputation: 187
I adopted the @Jotne solution and works perfectly! For example for mongodb server in my NAS
#! /bin/bash
case "$(pidof mongod | wc -w)" in
0) echo "Restarting mongod:"
mongod --config mongodb.conf
;;
1) echo "mongod already running"
;;
esac
Upvotes: 8
Reputation: 685
I have adopted your script for my situation Jotne.
#! /bin/bash
logfile="/var/oscamlog/oscam1check.log"
case "$(pidof oscam1 | wc -w)" in
0) echo "oscam1 not running, restarting oscam1: $(date)" >> $logfile
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
;;
2) echo "oscam1 running, all OK: $(date)" >> $logfile
;;
*) echo "multiple instances of oscam1 running. Stopping & restarting oscam1: $(date)" >> $logfile
kill $(pidof oscam1 | awk '{print $1}')
;;
esac
While I was testing, I ran into a problem..
I started 3 extra process's of oscam1 with this line:
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1
which left me with 8 process for oscam1. the problem is this..
When I run the script, It only kills 2 process's at a time, so I would have to run it 3 times to get it down to 2 process..
Other than killall -9 oscam1
followed by /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1
, in *)
is there any better way to killall apart from the original process? So there would be zero downtime?
Upvotes: 5