Reputation: 3155
I have a Python script that runs bash scripts. I need to be able to kill a bash script if it appears to run forever, and it also has to run in a chroot jail because the script might be dangerous. I start it with psutil.Popen() and let it run for two seconds. If it does not finish on its own, I send SIGKILL to it and to all of its children.
The problem is that if I kill one script for exceeding its time limit and then run another one, the main (Python) script receives a SIGSTOP. On my local machine I used a really stupid workaround: the Python script wrote its PID to a file at startup, and a second script sent SIGCONT every second to the PID stored in that file. This has two problems: it is really stupid, and, even worse, it does not work on the server, where SIGCONT simply does nothing.
The sequence is: the Python script runs a bash script responsible for the jail, and that bash script runs the possibly dangerous and/or infinite script, which might have children of its own.
The relevant parts of the code:
Main python script
    p = psutil.Popen(["bash", mode, script_path, self.TESTENV_ROOT])
    start = time.time()
    while True:
        if p.status() == psutil.STATUS_ZOMBIE:
            # process ended naturally
            duration = time.time() - start
            self.stdout.write("Script finished, execution time: {}s".format(duration))
            break
        if time.time() > start + run_limit:
            children = p.children(recursive=True)
            for child in children:
                child.kill()
            p.kill()
            duration = None
            self.stdout.write("Script exceeded maximum time ({}s) and was killed.".format(run_limit))
            break
        time.sleep(0.01)
    os.kill(os.getpid(), 17)  # SIGCHLD
    return duration
Running script in chroot ($1 is the script to be run in the chroot jail, $2 is the jail path)
#!/usr/bin/env bash
# copy script to chroot environment
cp "$1" "$2/prepare.sh"
# run script
chmod u+x "$2/prepare.sh"
echo './prepare.sh' | chroot "$2"
rm "$2/prepare.sh"
Example prepare.sh script
#!/bin/bash
echo asdf > file
I spent some time trying to solve the issue. I found out that this script (which is not using chroot jail to run bash scripts) is working perfectly:
    import psutil
    import os
    import time

    while True:
        if os.path.exists("infinite.sh"):
            p = psutil.Popen(["bash", "infinite.sh"])
            start = time.time()
            while True:
                if p.status() == psutil.STATUS_ZOMBIE:
                    # process ended naturally
                    break
                if time.time() > start + 2:
                    # process needs too much time and has to be killed
                    children = p.children(recursive=True)
                    for child in children:
                        child.kill()
                    p.kill()
                    break
            os.remove("infinite.sh")
            os.kill(os.getpid(), 17)
My questions are: why does the main script receive these SIGSTOPs? Is it due to the chroot jail?
Thanks for your ideas.
EDIT: I found out that I get stopped at the moment I run the first script after killing an overtime one, regardless of whether I use os.system or psutil.Popen.
EDIT2: After more investigation, the critical line is echo './prepare.sh' | chroot "$2" in the bash script controlling the chroot jail. The question now is: what the hell is wrong with it?
EDIT3: This might be a related problem, if it helps someone.
Upvotes: 0
Views: 2359
Reputation: 301
This thread is a bit old, but I believe I know the cause of your problem (I had a similar issue).
The Linux signal(7) man page says:
Linux supports the standard signals listed below. [...] First the signals described in the original POSIX.1-1990 standard.
Signal     Value     Action   Comment
──────────────────────────────────────────────────────────────────────
SIGHUP        1       Term    Hangup detected on controlling terminal
                              or death of controlling process
SIGINT        2       Term    Interrupt from keyboard
SIGQUIT       3       Core    Quit from keyboard
SIGILL        4       Core    Illegal Instruction
SIGABRT       6       Core    Abort signal from abort(3)
SIGFPE        8       Core    Floating-point exception
SIGKILL       9       Term    Kill signal
SIGSEGV      11       Core    Invalid memory reference
SIGPIPE      13       Term    Broken pipe: write to pipe with no
                              readers; see pipe(7)
SIGALRM      14       Term    Timer signal from alarm(2)
SIGTERM      15       Term    Termination signal
SIGUSR1   30,10,16    Term    User-defined signal 1
SIGUSR2   31,12,17    Term    User-defined signal 2
SIGCHLD   20,17,18    Ign     Child stopped or terminated
SIGCONT   19,18,25    Cont    Continue if stopped
SIGSTOP   17,19,23    Stop    Stop process
SIGTSTP   18,20,24    Stop    Stop typed at terminal
SIGTTIN   21,21,26    Stop    Terminal input for background process
SIGTTOU   22,22,27    Stop    Terminal output for background process
The table shows that, by default, a process is stopped not only by SIGSTOP but also when it receives SIGTSTP, SIGTTIN, or SIGTTOU.
This page explains that:
[SIGTTIN and SIGTTOU] are signals that are sent to background processes that they attempt to read from (SIGTTIN) or write to (SIGTTOU) their controlling terminal (or tty).
...
[...] changing terminal settings [from a background process] does cause SIGTTOU to be sent
I used sudo strace -tt -o [trace_output_file] -p [pid] to see which signal caused my process to stop.
How to solve the problem? Sadly, I cannot get your reduced example to work: what does your infinite.sh look like, and why are you removing it during execution? I suggest redirecting stdin, stdout and stderr so that the child never touches the terminal. Have you tried the following?
    from subprocess import DEVNULL

    # The keyword arguments are lowercase: stdin, stdout, stderr.
    p = psutil.Popen(["bash", mode, script_path, self.TESTENV_ROOT],
                     stdin=DEVNULL, stdout=DEVNULL, stderr=DEVNULL)
You can of course also use subprocess.PIPE to handle the output in your Python code or simply redirect to a file. I am not sure how to handle unauthorized attempts to modify the tty settings.
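The redirection can be tried in isolation with the sketch below. It is not the asker's code: it uses the standard library's subprocess.Popen instead of psutil.Popen, replaces the polling loop with Popen.wait(timeout=...), and adds start_new_session=True so that the child and its descendants can be killed as one process group. The function name run_with_timeout is hypothetical, and the sketch assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import time


def run_with_timeout(cmd, run_limit):
    """Run cmd detached from the terminal.

    Returns the duration in seconds, or None if the time limit was
    exceeded and the process group was killed.
    run_with_timeout is a hypothetical name used for illustration.
    """
    start = time.time()
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,    # the child cannot read from the tty
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # assumption: own session and process group
    )
    try:
        proc.wait(timeout=run_limit)
    except subprocess.TimeoutExpired:
        try:
            # Kill the child and every descendant still in its process group.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group vanished between the timeout and the kill
        proc.wait()  # reap the child so that no zombie is left behind
        return None
    return time.time() - start


if __name__ == "__main__":
    quick = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print("quick:", run_with_timeout(quick, 5.0))
    print("slow:", run_with_timeout(slow, 0.5))
```

Descendants that move themselves into a different process group would escape os.killpg, so walking p.children(recursive=True) as in the question may still be needed for hostile scripts.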
Upvotes: 1
Reputation: 3155
Ok, I finally found the solution. The problem really was on the chroot line in the bash script:
echo './prepare.sh' | chroot "$2"
This appears to be incorrect for some reason. The correct way to run a command in chroot is:
chroot chroot_path shell -c command
So for example:
chroot '/home/chroot_jail' '/bin/sh' -c 'rm -rf /'
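When the jail is started from Python, the same form can be passed as an argument list, so that the command is not piped through stdin at all. A minimal sketch, assuming that /bin/sh exists inside the jail; build_chroot_argv and run_in_chroot are hypothetical helper names, and actually running chroot requires root privileges:

```python
import subprocess


def build_chroot_argv(jail_path, command, shell="/bin/sh"):
    """Build the argv for: chroot jail_path shell -c command.

    build_chroot_argv is a hypothetical helper, not part of any library.
    """
    return ["chroot", jail_path, shell, "-c", command]


def run_in_chroot(jail_path, command):
    """Start the command inside the jail with stdin detached (needs root)."""
    return subprocess.Popen(
        build_chroot_argv(jail_path, command),
        stdin=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    # Only prints the argument list; nothing is executed here.
    print(build_chroot_argv("/home/chroot_jail", "./prepare.sh"))
```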
Hope this helps someone.
Upvotes: 1
Reputation: 11316
I'm pretty sure you're running this on Mac OS and not Linux. Why? You're sending the hard-coded signal 17 to your main Python process instead of using:
import signal
signal.SIGCHLD
I believe you have a handler for signal 17 that is supposed to respawn the jailed process in response to this signal. But signal.SIGCHLD == 17 on Linux and signal.SIGCHLD == 20 on Mac OS.
Now the answer to your question is: signal.SIGSTOP == 17 on Mac OS.
Yes, your process sends SIGSTOP to itself with os.kill(os.getpid(), 17).
Mac OS signal man page
EDIT:
Actually, it can also happen on Linux: the Linux signal man page shows that, depending on the architecture, signal 17 can be SIGUSR2, SIGCHLD or SIGSTOP. I therefore strongly recommend using the constants from the signal module of the standard library instead of hard-coded signal numbers.
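To check what a given number means on a particular machine, the signal module can translate between numbers and names. A small sketch, assuming Python 3.5 or later (where signal.Signals is available); signal_name is a hypothetical helper name:

```python
import os
import signal


def signal_name(signum):
    """Return the symbolic name of a signal number on the current platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unknown"


if __name__ == "__main__":
    # What the hard-coded 17 means depends on the platform.
    print("17 is", signal_name(17))
    print("SIGCHLD is", int(signal.SIGCHLD))
    print("SIGSTOP is", int(signal.SIGSTOP))

    # Portable version of the call from the question: SIGCHLD is ignored
    # by default, so the process keeps running.
    os.kill(os.getpid(), signal.SIGCHLD)
```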
Upvotes: 2