Reputation: 702
I've been attempting to write a program that can accurately kill a forked process once the child exceeds a certain running time. The process should be spawned from an executable on disk. I am using Linux. Let's not even consider the ugliness of simply forking a process on Windows (I feel sorry for whoever has to maintain the Windows source code).
My current solution calls setrlimit within the child process and then calls execvp to replace the child with the executable. setrlimit can be given RLIMIT_CPU to limit the CPU running time of a process. Essentially, the forked child limits its own running time and then uses execvp to start the executable.
There is a struct proc_timer that I wrote that stores the required information needed to run the executable:
typedef struct _proc_timer {
    double limit;
    char** args;
    char* name;
    pid_t proc;
    int argc;
} proc_timer;
This struct is correctly allocated and initialized by a function called make_timer, so that isn't the issue.
The current code looks something like this:
int status;
const pid_t proc = fork();
if (proc < 0)
    return FORK_FAIL;
else if (proc == 0) {
    struct rlimit rlim;
    // An extra second, since RLIMIT_CPU is expressed in whole seconds.
    rlim.rlim_cur = rlim.rlim_max = (rlim_t)timer->limit + 1;
    if (setrlimit(RLIMIT_CPU, &rlim) < 0)
        _exit(TIME_FAILURE);
    execvp(timer->name, timer->args);
    _exit(NOT_FOUND); // only reached if execvp failed.
}
timer->proc = proc; // save the PID, probably not useful though.
waitpid(proc, &status, 0); // wait for the child.
// WTERMSIG is only meaningful if the child was terminated by a signal.
// If true, then the termination was caused by an exceeded time limit.
if (WIFSIGNALED(status) && WTERMSIG(status) == SIGXCPU) {
    fprintf(stderr, "We crossed the time limit!\n");
}
Note that in the above code, I have to give the child process more than limit seconds, because struct rlimit can only take a limit of type rlim_t, which is an integer type. If the user wants a fractional number of seconds, I still have to provide a whole number of seconds to the child process. That's why I am forced to write:
rlim.rlim_cur = rlim.rlim_max = (rlim_t)timer->limit + 1; // An extra second, since rlim_t is in seconds.
which is obviously an overapproximation, but I haven't found another viable option.
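As a small aside on the rounding itself, here is a minimal sketch of a helper that rounds up only when the limit actually has a fractional part, so a limit of exactly 2.0 becomes 2 seconds instead of 3. The name cpu_limit_seconds is hypothetical and not part of the code above, and the result is still in whole seconds, so this does not solve the resolution problem.

```c
#include <sys/resource.h>

/* Hypothetical helper (not from the original code): the smallest whole
 * number of seconds that is >= limit. Negative or zero limits map to 0. */
static rlim_t cpu_limit_seconds(double limit)
{
    if (limit <= 0.0)
        return 0;

    rlim_t whole = (rlim_t)limit;   /* truncates toward zero */

    /* Add one second only if truncation actually dropped a fraction. */
    return ((double)whole < limit) ? whole + 1 : whole;
}
```

With this, the assignment in the child could read rlim.rlim_cur = rlim.rlim_max = cpu_limit_seconds(timer->limit), assuming timer->limit is positive.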
The following solution wouldn't work anyway, since ualarm measures actual time, and not CPU time, which I didn't know when I wrote this question. I was only interested in CPU time.
I have found a potential solution, however. It involves calling ualarm before the call to waitpid. Since ualarm takes an argument in microseconds, it is much more precise than the setrlimit solution. Then, immediately after waitpid, I would disable the alarm with a call to ualarm(0, 0). That way, if the parent was still waiting for the child when the time limit was crossed, the parent would be sent SIGALRM, and I could handle it. This solution, however, has so many problems that I would never consider using it. The only way to handle a signal is to write a function that acts as the handler. It can be installed with sigaction or simply with a call to signal. However, the handler can only be a function that takes one argument, namely the integer number of the signal that was sent. There is no way for the handler to know which process to kill!
The only way around this would be to store the pid in a global variable, but that causes problems, since this code is meant to run in the background of a GUI application in which the user should, in theory, be able to start this procedure several times at once. Keeping a separate global variable for every running process would make such a solution unworkable.
So, the optimal solution here is to use the setrlimit function in the child. However, I cannot accurately use floating point numbers here! Is there another, more accurate solution?
Upvotes: 2
Views: 3336
Reputation: 70931
You could fork off a launcher&killer (l&k) process for each process to be launched and killed.
This l&k process forks off the actual process to be killed later, stores its pid globally (that is, local to this l&k process) and then sets up an alarm handler (armed using alarm()) to kill the actual process it forked off beforehand.
To avoid a race, the l&k process also needs to set up a handler to detect whether the child terminated before the alarm signal was received.
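A rough sketch of that idea, with a few liberties: the names run_with_wall_limit and launch_and_kill are made up for illustration, and instead of two asynchronous handlers the l&k process blocks SIGALRM and SIGCHLD and collects whichever arrives first with sigwait(), which is one way to sidestep the race. This relies on a blocked SIGCHLD staying pending, which is the behaviour on Linux. Note that alarm() measures wall-clock time in whole seconds, not CPU time.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of the hypothetical launcher-and-killer process. Never returns:
 * exits with 0 (finished in time), 1 (timed out) or 2 (setup error). */
static void launch_and_kill(char *const argv[], unsigned seconds)
{
    sigset_t set, old;

    signal(SIGCHLD, SIG_DFL);           /* in case it was inherited as ignored */
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    sigaddset(&set, SIGCHLD);
    /* Block both signals so they stay pending until sigwait() picks them up. */
    if (sigprocmask(SIG_BLOCK, &set, &old) < 0)
        _exit(2);

    pid_t target = fork();
    if (target < 0)
        _exit(2);
    if (target == 0) {
        /* The signal mask survives exec, so restore it for the real program. */
        sigprocmask(SIG_SETMASK, &old, NULL);
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    alarm(seconds);                     /* wall-clock limit, whole seconds */

    int sig = 0, timed_out = 0;
    if (sigwait(&set, &sig) == 0 && sig == SIGALRM) {
        /* The target has not been reaped yet, so its PID cannot have been reused. */
        kill(target, SIGKILL);
        timed_out = 1;
    }
    alarm(0);                           /* cancel the alarm if the child won */
    waitpid(target, NULL, 0);           /* reap the target */
    _exit(timed_out);
}

/* Returns 1 on timeout, 0 if the command finished in time, -1 on error. */
static int run_with_wall_limit(char *const argv[], unsigned seconds)
{
    pid_t lk = fork();
    if (lk < 0)
        return -1;
    if (lk == 0)
        launch_and_kill(argv, seconds); /* does not return */

    int status;
    if (waitpid(lk, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return (code == 0 || code == 1) ? code : -1;
}
```

For example, calling run_with_wall_limit with the arguments {"sleep", "5", NULL} and a limit of 1 should report a timeout. In a GUI application the parent would presumably not block in waitpid like this, but each launched process gets its own l&k process, so no shared global state is needed.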
Upvotes: 1
Reputation: 140629
Ignore what I said in comments: I believe you are looking for setitimer with ITIMER_VIRTUAL. You would call this in the child before execve. It can trigger a fatal signal after a certain amount of CPU time has elapsed, with a resolution in microseconds, and (unlike timer_create and ualarm) is documented to survive execve. Note however that nothing stops the process from clearing the timer itself.
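Here is a minimal sketch of how that could look. The function name run_with_cpu_limit and its return convention are made up for illustration. ITIMER_VIRTUAL counts only user-mode CPU time and delivers SIGVTALRM when it expires; if system time should count as well, ITIMER_PROF (which delivers SIGPROF) would be the variant to try.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the question's code.
 * Returns 1 if the child was killed by the CPU timer, 0 if it ended some
 * other way, and -1 if fork or waitpid failed.
 * A limit of 0 would disarm the timer, so pass a positive value. */
static int run_with_cpu_limit(char *const argv[], double limit_seconds)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct itimerval tv;

        /* One-shot timer: no reload interval. */
        tv.it_interval.tv_sec = 0;
        tv.it_interval.tv_usec = 0;

        /* Split the fractional limit into whole seconds and microseconds. */
        tv.it_value.tv_sec = (time_t)limit_seconds;
        tv.it_value.tv_usec =
            (suseconds_t)((limit_seconds - (double)tv.it_value.tv_sec) * 1e6);

        /* An ignored signal stays ignored across exec, so force the default
         * (fatal) action for SIGVTALRM. */
        signal(SIGVTALRM, SIG_DFL);

        if (setitimer(ITIMER_VIRTUAL, &tv, NULL) < 0)
            _exit(126);

        execvp(argv[0], argv);  /* interval timers are preserved across exec */
        _exit(127);             /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) && WTERMSIG(status) == SIGVTALRM;
}
```

In the question's code, the parent-side check would then compare WTERMSIG(status) against SIGVTALRM instead of SIGXCPU. As noted above, the executed program can still install its own handler or clear the timer, so this only works for programs that leave it alone.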
Upvotes: 4