Reputation: 644
I want to write a Python script that checks every minute whether some predefined process is still running on a Linux machine and, if it isn't, prints a timestamp of when it crashed. I have written a script that does exactly that, but unfortunately it only works correctly with a single process.
This is my code:
import subprocess
import shlex
import time
from datetime import datetime

proc_def = "top"
grep_cmd = "pgrep -a " + proc_def
try:
    proc_run = subprocess.check_output(shlex.split(grep_cmd)).decode('utf-8')
    proc_run = proc_run.strip().split('\n')
    '''
    Creating a dictionary with key the PID of the process and value
    the command line
    '''
    proc_dict = dict(zip([i.split(' ', 1)[0] for i in proc_run],
                         [i.split(' ', 1)[1] for i in proc_run]))

    check_run = "ps -o pid= -p "
    for key, value in proc_dict.items():
        check_run_cmd = check_run + key
        try:
            # While the output of check_run_cmd isn't empty line do
            while subprocess.check_output(
                    shlex.split(check_run_cmd)
            ).decode('utf-8').strip():
                # This print statement is for debugging purposes only
                print("Running")
                time.sleep(3)
            '''
            If the check_run_cmd is returning an error, it shows us the time
            and date of the crash as well as the PID and the command line
            '''
        except subprocess.CalledProcessError as e:
            print(f"PID: {key} of command: \"{value}\" stopped at {datetime.now().strftime('%d-%m-%Y %T')}")
            exit(1)
# Check if the proc_def is actually running on the machine
except subprocess.CalledProcessError as e:
    print(f"The \"{proc_def}\" command isn't running on this machine")
For example, if there are two top processes, it shows the crash time of only one of them and then exits. I want the script to stay active as long as another process is still running, and to exit only once both processes have been killed. It should report when each of the processes crashed.
It should also not be limited to two processes; it has to support any number of processes started with the same proc_def command.
Upvotes: 1
Views: 363
Reputation: 25980
Have to change the logic a bit, but basically you want an infinite loop alternating a check on all processes - not checking the same one over and over:
import subprocess
import shlex
import time
from datetime import datetime

proc_def = "top"
grep_cmd = "pgrep -a " + proc_def
try:
    proc_run = subprocess.check_output(shlex.split(grep_cmd)).decode('utf-8')
    proc_run = proc_run.strip().split('\n')
    '''
    Creating a dictionary with key the PID of the process and value
    the command line
    '''
    proc_dict = dict(zip([i.split(' ', 1)[0] for i in proc_run],
                         [i.split(' ', 1)[1] for i in proc_run]))

    check_run = "ps -o pid= -p "
    while proc_dict:
        for key, value in proc_dict.items():
            check_run_cmd = check_run + key
            try:
                # check_output raises CalledProcessError once the PID is gone
                subprocess.check_output(shlex.split(check_run_cmd)).decode('utf-8').strip()
                # This print statement is for debugging purposes only
                print("Running")
                time.sleep(3)
            except subprocess.CalledProcessError as e:
                print(f"PID: {key} of command: \"{value}\" stopped at {datetime.now().strftime('%d-%m-%Y %T')}")
                del proc_dict[key]
                # proc_dict changed size, so restart the for loop
                break
# Check if the proc_def is actually running on the machine
except subprocess.CalledProcessError as e:
    print(f"The \"{proc_def}\" command isn't running on this machine")
This suffers from the same problems as the original code: the time resolution is 3 seconds at best, and if a new process is started while this script is running, it won't be monitored (though this may be desired).
The first problem would be fixed by sleeping for less time, depending on what you need; the second by re-running the initial lines that create proc_dict inside the loop, which would then become a while True loop.
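For what it's worth, here is a rough sketch of how that second fix might look. It is not the code above: instead of asking ps about each PID, it re-runs pgrep on every cycle and compares the new snapshot with the previous one, so processes started later get picked up too. The names parse_pgrep, scan, diff_procs, monitor and the interval parameter are made up for this illustration, and it assumes Python 3.7+ and a pgrep that supports -a.

```python
import subprocess
import time
from datetime import datetime


def parse_pgrep(output):
    """Turn `pgrep -a` output into a {pid: command line} dict."""
    procs = {}
    for line in output.strip().splitlines():
        pid, _, cmdline = line.partition(' ')
        procs[pid] = cmdline
    return procs


def scan(proc_def):
    """Return the currently running processes that match proc_def."""
    result = subprocess.run(["pgrep", "-a", proc_def],
                            capture_output=True, text=True)
    # pgrep exits with status 1 when nothing matches; treat that as "none"
    return parse_pgrep(result.stdout) if result.returncode == 0 else {}


def diff_procs(known, current):
    """Compare two snapshots and return (stopped, started) dicts."""
    stopped = {pid: cmd for pid, cmd in known.items() if pid not in current}
    started = {pid: cmd for pid, cmd in current.items() if pid not in known}
    return stopped, started


def monitor(proc_def, interval=3):
    """Report every matching process that disappears; return when none are left."""
    known = scan(proc_def)
    if not known:
        print(f'The "{proc_def}" command isn\'t running on this machine')
        return
    while known:
        time.sleep(interval)
        current = scan(proc_def)
        stopped, _started = diff_procs(known, current)
        stamp = datetime.now().strftime('%d-%m-%Y %T')
        for pid, cmd in stopped.items():
            print(f'PID: {pid} of command: "{cmd}" stopped at {stamp}')
        # Newly started processes become part of the next comparison
        known = current
```

Calling monitor("top", interval=60) would give the once-a-minute check from the question; a smaller interval gives a finer timestamp at the cost of running pgrep more often. Note that a PID reused by a new matching process between two scans would go unnoticed with this approach.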
Upvotes: 1