linuxnoob

Reputation: 685

Linux Script to check if process is running and act on the result

I have a process that fails regularly and sometimes starts duplicate instances.

When I run ps x |grep -v grep |grep -c "processname" I get 2. This is normal, as the process runs with a recovery process.

If I get 0, I want to start the process. If I get 4, I want to stop and restart the process.

What I need is a way of taking the result of ps x |grep -v grep |grep -c "processname"

and then setting up a simple three-option function:

ps x |grep -v grep |grep -c "processname"
if answer = 0 (start process & write NOK & Time to log /var/processlog/check)
if answer = 2 (Do nothing & write OK & time to log /var/processlog/check)
if answer = 4 (stop & restart the process & write NOK & Time to log /var/processlog/check)

The process is stopped with killall -9 process. The process is started with process -b -c /usr/local/etc

My main problem is finding a way to act on the result of ps x |grep -v grep |grep -c "processname".

Ideally, I would like to make the result of that grep a variable within the script with something like this:

process=$(ps x |grep -v grep |grep -c "processname")

If possible.
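Command substitution does work this way. A minimal sketch of acting on the captured count, assuming the placeholder name processname from the question; the function names count_instances and decide_action are hypothetical, and the real start, stop and logging commands are left out:

```shell
#!/bin/bash
# Sketch only: "processname" is the placeholder from the question;
# count_instances and decide_action are hypothetical helper names.

count_instances() {
    # Count the lines of `ps x` output that match the name, excluding grep itself.
    ps x | grep -v grep | grep -c "$1"
}

decide_action() {
    # Map an instance count to an action keyword.
    case "$1" in
        0) echo start ;;      # nothing running
        2) echo ok ;;         # process plus its recovery process
        *) echo restart ;;    # 4, or any other unexpected count
    esac
}

process=$(count_instances "processname")
action=$(decide_action "$process")
echo "count=$process action=$action"
```

The action keyword can then drive the actual start, stop and logging commands.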

Upvotes: 43

Views: 136175

Answers (8)

Jotne

Reputation: 41456

Scripts to monitor whether a process on a system is running.

The script is run from crontab once every minute.

This handles both the case where the process is not running and the case where it is running multiple times:

#! /bin/bash

case "$(pidof amadeus.x86 | wc -w)" in

0)  echo "Restarting Amadeus:     $(date)" >> /var/log/amadeus.txt
    /etc/amadeus/amadeus.x86 &
    ;;
1)  # all ok
    ;;
*)  echo "Removed double Amadeus: $(date)" >> /var/log/amadeus.txt
    kill $(pidof amadeus.x86 | awk '{print $1}')
    ;;
esac

0 If the process is not found, restart it.
1 If exactly one instance is found, all is OK.
* If two or more instances are running, kill one of them (the first PID that pidof lists).
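pidof prints all matching PIDs on a single line separated by spaces, so it is the word count rather than the line count that tells one instance from several. A small sketch with hypothetical PIDs standing in for pidof output:

```shell
#!/bin/bash
# Hypothetical PIDs, formatted the way pidof prints them (one line, space separated).
pids="1234 5678 9012"

lines=$(echo "$pids" | wc -l)
words=$(echo "$pids" | wc -w)

# $(( )) strips any padding that some wc implementations add.
echo "lines=$((lines)) words=$((words))"
```

Here the line count is 1 while the word count is 3, so a case statement keyed on the line count would never reach its "two or more" branch.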


A simpler version. This just tests whether the process is running and, if not, restarts it.

It checks the exit status $? of the pidof program, which is 0 if the process is running and 1 if not.

#!/bin/bash
pidof  amadeus.x86 >/dev/null
if [[ $? -ne 0 ]] ; then
        echo "Restarting Amadeus:     $(date)" >> /var/log/amadeus.txt
        /etc/amadeus/amadeus.x86 &
fi
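The same check can also be written by testing pidof directly in the if condition instead of inspecting $? afterwards. A minimal sketch; the function name restart_if_missing is hypothetical, and the command to start is passed in as arguments rather than hard-coded:

```shell
#!/bin/bash
# restart_if_missing NAME COMMAND [ARGS...]
# Hypothetical helper: starts COMMAND in the background when pidof finds no NAME.
restart_if_missing() {
    local name=$1
    shift
    if ! pidof "$name" >/dev/null 2>&1; then
        echo "Restarting $name: $(date)"
        "$@" &
    fi
}

# Possible use (not run here):
#   restart_if_missing amadeus.x86 /etc/amadeus/amadeus.x86 >> /var/log/amadeus.txt
```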

And lastly, a one-liner:

pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus:     $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &

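This can then be used in crontab to run every minute, like this (cron runs commands with /bin/sh by default, so the [[ ]] test needs /bin/sh to be bash, or SHELL=/bin/bash set in the crontab):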

* * * * * pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus:     $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &


Upvotes: 82

Saurabh

Reputation: 1626

If you are using CentOS, there is no need to write a script and set up a cron job. One of the smartest ways is to have systemd restart the service on failure. Make the following changes to the service's unit file, for example /usr/lib/systemd/system/mariadb.service

Under the [Service] section of the file, add the following two lines:

Restart=always
RestartSec=3

After saving the file, reload the systemd configuration so that systemd is aware of the changed file:

systemctl daemon-reload
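As an alternative to editing the packaged unit file, systemd also reads drop-in files from a UNIT.d directory under /etc/systemd/system. A minimal sketch that writes the same two settings into such a drop-in; the function name write_restart_dropin is hypothetical, and override.conf is simply a conventional file name:

```shell
#!/bin/bash
# write_restart_dropin DIR
# Hypothetical helper: creates DIR and writes a drop-in with the restart settings.
write_restart_dropin() {
    local dir=$1
    mkdir -p "$dir" &&
        printf '[Service]\nRestart=always\nRestartSec=3\n' > "$dir/override.conf"
}

# On a real system, as root (not run here):
#   write_restart_dropin /etc/systemd/system/mariadb.service.d
#   systemctl daemon-reload
```

A drop-in keeps the change separate from the file shipped by the package, so a package update does not overwrite it.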

Read the following link for the complete steps - https://jonarcher.info/2015/08/ensure-systemd-services-restart-on-failure/

Upvotes: 0

John T.

Reputation: 481

In case you're looking for a more modern way to check whether a service is running (this will not work for just any old process), systemctl might be what you're looking for.

Here's the basic command:

systemctl show --property=ActiveState your_service_here

Which will yield very simple output (typically one of the following two lines, depending on whether the service is running; other states such as failed are also possible):

ActiveState=active
ActiveState=inactive

And if you'd like to know all of the properties you can get:

systemctl show --all your_service_here

If you prefer that alphabetized:

systemctl show --all your_service_here | sort

And the full code to act on it:

service=$1
result=$(systemctl show --property=ActiveState "$service")
if [[ "$result" == 'ActiveState=active' ]]; then
    echo "$service is running" # Do something here
else
    echo "$service is not running" # Do something else here
fi 
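Since ActiveState can also report values such as failed or activating, it can help to strip the ActiveState= prefix and branch on the bare value. A minimal sketch using parameter expansion; the function name classify_state and the grouping of states are assumptions made for illustration:

```shell
#!/bin/bash
# classify_state "ActiveState=<value>"
# Hypothetical helper: strips the property name and maps the value to a keyword.
classify_state() {
    local value=${1#ActiveState=}
    case "$value" in
        active)          echo running ;;
        inactive|failed) echo stopped ;;
        *)               echo "other:$value" ;;
    esac
}

# Possible use (not run here):
#   classify_state "$(systemctl show --property=ActiveState your_service_here)"
classify_state "ActiveState=active"
```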

Upvotes: 0

Taffman

Reputation: 79

I cannot get case to work at all. Here's what I have:

#! /bin/bash

logfile="/home/name/public_html/cgi-bin/check.log"

case "$(pidof -x script.pl | wc -w)" in

0)  echo "script not running, Restarting script:     $(date)" >> $logfile
    # ./restart-script.sh
    ;;
1)  echo "script Running:     $(date)" >> $logfile
    ;;
*)  echo "Removed duplicate instances of script: $(date)" >> $logfile
    # kill $(pidof -x ./script.pl | awk '{ $1=""; print $0}')
    ;;
esac

I have commented out the case action commands for now, just to test the script. The pidof -x command above returns '1', but the case statement gives the result for '0'.

Anyone have any idea where I'm going wrong?

Solved it by adding the following to my bash script: PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
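A likely explanation, though the post does not say so, is that the script ran with a restricted PATH in which pidof could not be found; the pipeline then feeds nothing to wc -w and the count is 0. A minimal sketch that extends PATH as above and checks for the tool before relying on it; the helper name require_cmd is hypothetical:

```shell
#!/bin/bash
# Extend PATH as in the fix above, then verify that the tool can be found.
PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

# require_cmd NAME: hypothetical helper, fails with a message if NAME is not in PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "$1 not found in PATH" >&2
        return 1
    }
}

require_cmd pidof || echo "pidof is missing, so the count would always be 0"
```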

Upvotes: 0

Taffman

Reputation: 79

The 'pidof' command will not display the PIDs of shell/perl/python scripts. So to find the process IDs of my Perl script I had to use the -x option, i.e. 'pidof -x perlscriptname'

Upvotes: 0

Kris Long

Reputation: 1

If you change awk '{print $1}' to '{ $1=""; print $0}' you will get all processes except for the first as a result. The output will start with the field separator (generally a space), but that does not matter to kill, because the unquoted command substitution is split into separate arguments. So:

#! /bin/bash

logfile="/var/oscamlog/oscam1check.log"

case "$(pidof oscam1 | wc -w)" in

0)  echo "oscam1 not running, restarting oscam1:     $(date)" >> $logfile
    /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
    ;;
2)  echo "oscam1 running, all OK:     $(date)" >> $logfile
    ;;
*)  echo "multiple instances of oscam1 running. Stopping & restarting oscam1:     $(date)" >> $logfile
    kill $(pidof oscam1 | awk '{ $1=""; print $0}')
    ;;
esac
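The same "skip the first PID" idea can be written with a bash array slice instead of awk. A minimal sketch, where the function name all_but_first and the sample PIDs are hypothetical; it takes the number of PIDs to keep as its first argument, on the assumption that two processes are the normal state as in the question. Which PIDs pidof lists first is not checked here:

```shell
#!/bin/bash
# all_but_first KEEP PID...
# Hypothetical helper: prints every PID except the first KEEP ones.
all_but_first() {
    local keep=$1
    shift
    local pids=("$@")
    echo "${pids[@]:$keep}"
}

# With real data this might be used as (not run here):
#   kill $(all_but_first 2 $(pidof oscam1))
all_but_first 2 101 102 103 104 105
```

With the sample PIDs this prints 103 104 105, leaving the first two untouched.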

It is worth noting that the pidof route seems to work fine for commands that have no spaces, but you would probably want to go back to a ps-based string if you were looking for, say, a python script named myscript that showed up under ps like

root 22415 54.0 0.4 89116 79076 pts/1 S 16:40 0:00 /usr/bin/python /usr/bin/myscript

Just an FYI

Upvotes: 0

Tirias

Reputation: 187

I adopted the @Jotne solution and it works perfectly! For example, for the mongodb server on my NAS:

#! /bin/bash

case "$(pidof mongod | wc -w)" in

0)  echo "Restarting mongod:"
    mongod --config mongodb.conf
    ;;
1)  echo "mongod already running"
    ;;
esac

Upvotes: 8

linuxnoob

Reputation: 685

I have adapted your script for my situation, Jotne.

#! /bin/bash

logfile="/var/oscamlog/oscam1check.log"

case "$(pidof oscam1 | wc -w)" in

0)  echo "oscam1 not running, restarting oscam1:     $(date)" >> $logfile
    /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
    ;;
2)  echo "oscam1 running, all OK:     $(date)" >> $logfile
    ;;
*)  echo "multiple instances of oscam1 running. Stopping & restarting oscam1:     $(date)" >> $logfile
    kill $(pidof oscam1 | awk '{print $1}')
    ;;
esac

While I was testing, I ran into a problem. I started 3 extra instances of oscam1 with this line: /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 which left me with 8 oscam1 processes. The problem is this: when I run the script, it only kills 2 processes at a time, so I would have to run it 3 times to get down to 2 processes.

Other than killall -9 oscam1 followed by /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1, is there any better way in the *) branch to kill everything apart from the original process, so that there would be zero downtime?

Upvotes: 5
