How to accurately kill child process after certain time limit in C?

Question

I've been attempting to write a program that can accurately kill a forked process once the child exceeds a certain running time. The process should be spawned from an executable on disk. I am on Linux. Let's not even consider the ugliness of simply forking a process on Windows (I feel sorry for whoever has to maintain the Windows source code).

My current solution calls setrlimit within the child process and then calls execvp to replace the child with the executable. Passing RLIMIT_CPU to setrlimit limits the CPU time a process may consume. Essentially, the forked child limits its own running time and then uses execvp to start the executable.

There is a struct proc_timer that I wrote that stores the required information needed to run the executable:

typedef struct _proc_timer {
    double limit;
    char** args;
    char* name;
    pid_t proc;
    int argc;
} proc_timer;

This struct is being correctly allocated and initialized via a function called make_timer. But that isn't the issue.

The current code looks something like this:

int status;
const pid_t proc = fork();
if (proc < 0)
    return FORK_FAIL;

else if (proc == 0) {
    struct rlimit rlim;
    rlim.rlim_cur = rlim.rlim_max = (rlim_t)timer->limit + 1; // An extra second, since rlim_t is in seconds.
    if (setrlimit(RLIMIT_CPU, &rlim) < 0)
        _exit(TIME_FAILURE);

    execvp(timer->name, timer->args);
    _exit(NOT_FOUND);
}

timer->proc = proc; // save the PID, probably not useful though.
waitpid(proc, &status, 0); // wait for the child.

// If true, then the termination was caused by an exceeded time limit.
// WTERMSIG is only meaningful when WIFSIGNALED(status) is true.
if (WIFSIGNALED(status) && WTERMSIG(status) == SIGXCPU) {
    fprintf(stderr, "We crossed the time limit!\n");
}

Note that in the above code, I have to give the child process more than limit seconds, because struct rlimit can only take a limit of type rlim_t, which is an integer. Now if the user wants a floating point number, I still have to provide an integer number of seconds to the child process. That's why I am forced to write:

rlim.rlim_cur = rlim.rlim_max = (rlim_t)timer->limit + 1; // An extra second, since rlim_t is in seconds.

which is obviously an overapproximation, but I haven't found another viable option.

Edit: the ualarm idea described in the next paragraph wouldn't work anyway, since ualarm measures real (wall-clock) time and not CPU time, which I didn't know when I wrote this question. I was only interested in CPU time.

I have found a potential solution, however. It involves calling ualarm before the call to waitpid. Since ualarm takes an argument in microseconds, it is much more precise than the setrlimit solution. Then, immediately after waitpid, I would disable the alarm with a call to ualarm(0, 0). That way, if the parent was still waiting for the child when the time limit was crossed, the parent would be sent SIGALRM, and I could handle it.

This solution, however, has so many problems that I would never consider using it. The only way to handle a signal is to write a function that acts as the handler and install it with sigaction or a simple call to signal. However, the handler can only be a function that takes one argument, the integer number of the signal sent. There is no way for the handler to know which process to kill!

The only way around this would be a global variable that stores the pid, but that causes problems: this code is to be used as background code for a GUI application, in which the user should theoretically be able to run this process several times at once. Keeping a separate global variable for every such process would make the approach unworkable.

So, the optimal solution here is to use the setrlimit function in the child. However, I cannot accurately use floating point numbers here! Is there another, more accurate solution?

zwol · Accepted Answer

Ignore what I said in comments: I believe you are looking for setitimer with ITIMER_VIRTUAL. You would call this in the child before execve. It can trigger a fatal signal (SIGVTALRM, whose default action is to terminate the process) after a certain amount of CPU time has elapsed, with a resolution in microseconds, and (unlike timer_create and ualarm) is documented to survive execve. Note however that nothing stops the process from clearing the timer itself.
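
As a rough illustration of that idea, here is a minimal sketch. The names seconds_to_timeval and run_with_cpu_limit are hypothetical helpers made up for this example, and the exit codes 126 and 127 are arbitrary placeholders, not anything from the question's code. Keep in mind that ITIMER_VIRTUAL counts only the time the process spends in user mode; if system time should count as well (as it does with RLIMIT_CPU), ITIMER_PROF, which delivers SIGPROF, would be the variant to try.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: split fractional seconds into whole seconds
 * and microseconds. Assumes seconds >= 0. */
struct timeval seconds_to_timeval(double seconds)
{
    struct timeval tv;
    tv.tv_sec = (time_t)seconds;
    tv.tv_usec = (suseconds_t)((seconds - (double)tv.tv_sec) * 1e6);
    return tv;
}

/* Hypothetical helper: run argv[0] (looked up via PATH) with a limit on
 * user-mode CPU time. Returns 1 if the child was killed by SIGVTALRM,
 * 0 if it ended in any other way, and -1 if fork or waitpid failed.
 * Note: a limit of exactly 0 disarms the timer instead of firing at once. */
int run_with_cpu_limit(double limit_seconds, char *const argv[])
{
    int status;
    const pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct itimerval it;
        it.it_interval.tv_sec = 0;   /* zero interval: fire once, no reload */
        it.it_interval.tv_usec = 0;
        it.it_value = seconds_to_timeval(limit_seconds);

        /* An ignored signal stays ignored across exec, so restore the
         * default (terminating) action before arming the timer. */
        signal(SIGVTALRM, SIG_DFL);

        if (setitimer(ITIMER_VIRTUAL, &it, NULL) < 0)
            _exit(126);              /* placeholder: could not arm the timer */

        execvp(argv[0], argv);
        _exit(127);                  /* placeholder: exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* WTERMSIG is only valid when the child was terminated by a signal. */
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGVTALRM;
}
```

With this sketch, a call such as run_with_cpu_limit(0.25, argv) would give the child roughly a quarter of a second of user-mode CPU time. Because the pid is a local variable and no signal handler is installed in the parent, several of these can be in flight at once from different threads. The caveat above still applies: the child program can disarm or reset the timer itself, so keeping an RLIMIT_CPU limit as a coarse backstop may still be worthwhile.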
