Saf
Saf

Reputation: 517

Race condition in C Signal handler

I'm doing some coursework, and we've been presented with the following code. Some of the questions ask what various lines of the code do, which is fine, and I understand, but the curveball is "This program contains a race condition. Where and why does it arise?"

The code:

#include <stdio.h>
#include <signal.h>

static void handler(int signo) { 
    printf("This is the SIGUSR1 signal handler!\n");
}

int main(void) { 
    sigset_t set;
    sigemptyset(&set);
    sigset(SIGUSR1, handler);
    sigprocmask(SIG_SETMASK, &set, NULL);

    while(1) {
        printf("This is main()!\n");
    }
    return 0;

}

I'm thinking along the lines that the race condition is that there is no way to know what order "This is main" or "This is the SIGUSR1" will be printed when the signal arrives, but if anyone could confirm or clarify this I'd much appreciate it. He also asks how it could be fixed (the race condition), not looking for a full answer but any tips would be appreciated.

Upvotes: 4

Views: 6781

Answers (2)

Nominal Animal
Nominal Animal

Reputation: 39426

Apparently the intent in the course is to modify the code to

  • Use a separate thread, which receives the signals calling sigwait() or sigwaitinfo() in a loop. The signals must be blocked (first, and all along, for all threads), or the operation is unspecified (or the signals are delivered to the other thread).

    This way there is no signal handler function per se, which would be limited to async-signal safe functions. The thread that calls sigwait()/sigwaitinfo() is a perfectly normal thread, and not limited in any way related to signals or signal handlers.

    (There are other ways to receive the signals, for example using a signal handler which sets a global flag, and which the loop checks. Most of those lead to busy-waiting, running a do-nothing loop, burning CPU time uselessly: a very bad solution. The way I describe here squanders no CPU time: the kernel will put the thread to sleep when it calls sigwait()/sigwaitinfo(), and wakes it up only when a signal arrives. If you want to limit the duration of the sleep, you can use sigtimedwait() instead.)

  • Since printf() et al. are not guaranteed to be thread-safe, you should probably use a pthread_mutex_t to protect output to standard output -- in other words, so that the two threads won't try to output at exactly the same time.

    In Linux this is not necessary, since GNU C printf() (except for the _unlocked() versions) is thread-safe; each call to these functions uses an internal mutex already.

    Note that the C library may cache outputs, so to make sure the data is output, you need to call fflush(stdout);.

    The mutex is especially useful, if you want to use multiple printf(), fputs(), or similar calls atomically, without the other thread being able to inject output in between. Therefore, the mutex is recommended even if on Linux in simple cases it is not required. (And yes, you do want to do the fflush() too while holding the mutex, although it may cause the mutex to be held for a long time if the output blocks.)

I'd personally solve the entire issue completely differently -- I'd use write(STDERR_FILENO,) in the signal handler to output to standard error, and have the main program output to standard output; no threads or anything special needed, just a simple low-level write loop in the signal handler. Strictly speaking, my program would behave differently, but to the end user the results look very much the same. (Except that one could redirect the outputs to different terminal windows, and look at them side-by-side; or redirect them to helper scripts/programs which prepend nanosecond wall-clock timestamps to each input line; and other similar tricks useful when investigating things.)

Personally, I found the jump from the original problem to the "correct solution" -- if indeed what I am describing is the correct solution; I do believe it is -- to be a bit of a stretch. This approach only dawned on me when Saf mentioned that the correct solution is expected to use pthreads.

I hope you find this informative, but not a spoiler.


Edited 2013-03-13:

Here is the writefd() function I use to safely write data from a signal handler to a descriptor. I also included the wrapper functions wrout() and wrerr() that you can use to write strings to standard output or standard error, respectively.

#include <unistd.h>
#include <string.h>
#include <errno.h>

/**
 * writefd() - A variant of write(2)
 *
 * This function returns 0 if the write was successful, and the nonzero
 * errno code otherwise, with errno itself kept unchanged.
 * This function is safe to use in a signal handler;
 * it is async-signal-safe, and keeps errno unchanged.
 *
 * Interrupts due to signal delivery are ignored.
 * This function does work with non-blocking sockets,
 * but it does a very inefficient busy-wait loop to do so.
*/
int writefd(const int descriptor, const void *const data, const size_t size)
{
    const char       *head = (const char *)data;
    const char *const tail = (const char *)data + size;
    ssize_t           bytes;
    int               saved_errno, retval;

    /* File descriptor -1 is always invalid. */
    if (descriptor == -1)
        return EINVAL;

    /* If there is nothing to write, return immediately. */
    if (size == 0)
        return 0;

    /* Save errno, so that it can be restored later on.
     * errno is a thread-local variable, meaning its value is
     * local to each thread, and is accessible only from the same thread.
     * If this function is called in an interrupt handler, this stores
     * the value of errno for the thread that was interrupted by the
     * signal delivery. If we restore the value before returning from
     * this function, all changes this function may do to errno
     * will be undetectable outside this function, due to thread-locality.
    */
    saved_errno = errno;

    while (head < tail) {

        bytes = write(descriptor, head, (size_t)(tail - head));

        if (bytes > (ssize_t)0) {
            head += bytes;

        } else
        if (bytes != (ssize_t)-1) {
            errno = saved_errno;
            return EIO;

        } else
        if (errno != EINTR && errno != EAGAIN && errno != EWOULDBLOCK) {
            /* EINTR, EAGAIN and EWOULDBLOCK cause the write to be
             * immediately retried. Everything else is an error. */
            retval = errno;
            errno = saved_errno;
            return retval;
        }
    }

    errno = saved_errno;
    return 0;
}

/**
 * wrout() - An async-signal-safe alternative to fputs(string, stdout)
 *
 * This function will write the specified string to standard output,
 * and return 0 if successful, or a nonzero errno error code otherwise.
 * errno itself is kept unchanged.
 *
 * You should not mix output to stdout and this function,
 * unless stdout is set to unbuffered.
 *
 * Unless standard output is a pipe and the string is at most PIPE_BUF
 * bytes long (PIPE_BUF >= 512), the write is not atomic.
 * This means that if you use this function in a signal handler,
 * or in multiple threads, the writes may be interspersed with each other.
*/
int wrout(const char *const string)
{
    if (string)
        return writefd(STDOUT_FILENO, string, strlen(string));
    else
        return 0;
}

/**
 * wrerr() - An async-signal-safe alternative to fputs(string, stderr)
 *
 * This function will write the specified string to standard error,
 * and return 0 if successful, or a nonzero errno error code otherwise.
 * errno itself is kept unchanged.
 *
 * You should not mix output to stderr and this function,
 * unless stderr is set to unbuffered.
 *
 * Unless standard error is a pipe and the string is at most PIPE_BUF
 * bytes long (PIPE_BUF >= 512), the write is not atomic.
 * This means that if you use this function in a signal handler,
 * or in multiple threads, the writes may be interspersed with each other.
*/
int wrerr(const char *const string)
{
    if (string)
        return writefd(STDERR_FILENO, string, strlen(string));
    else
        return 0;
}

If the file descriptor refers to a pipe, writefd() can be used to write up to PIPE_BUF (at least 512) bytes atomically. writefd() can also be used in an I/O intensive application to convert signals (and if raised using sigqueue(), the related value, an integer or a pointer) to a socket or pipe output (data), making it much easier to multiplex multiple I/O streams and signal handling. A variant (with an extra file descriptor marked close-on-exec) is often used to easily detect whether a child process executed another process, or failed; otherwise it is difficult to detect which process -- the original child, or the executed process -- exited.

In the comments to this answer, there was some discussion about errno, and confusion whether the fact that write(2) modifies errno makes it unsuitable for signal handlers.

First, POSIX.1-2008 (and earlier) defines async-signal-safe functions as those that can safely be called from signal handlers. The 2.4.3 Signal actions chapter includes a list of such functions, including write(). Note that it also explicitly states that "Operations which obtain the value of errno and operations which assign a value to errno shall be async-signal-safe."

That means that POSIX.1 intends write() to be safe to use in a signal handler, and that errno can also be manipulated to avoid the interrupted thread seeing unexpected changes in errno.

Because errno is a thread-local variable, each thread has their own errno. When a signal is delivered, it always interrupts one of the existing threads in the process. Signals can be directed to a specific thread, but usually the kernel decides which thread gets a process-wide signal; it varies between systems. If there is only one thread, the initial or main thread, then obviously it is the one that gets interrupted. All this means that if the signal handler saves the value of errno it initially sees, and restores it just before it returns, the changes to errno are invisible outside the signal handler.

There is one way to detect it, though, also hinted at in POSIX.1-2008 by the careful wording:

Technically, &errno is almost always valid (depending on system, compiler, and standards applied), and yields the address of the int variable holding the error code for the current thread. It is therefore possible for another thread to monitor the error code of another thread, and yes, this thread would see the changes to it done within the signal handler. However, there is no guarantee that the other thread is able to access the error code atomically (although it is atomic on many architectures): such "monitoring" would be informative only anyway.

It is a pity that almost all signal handler examples in C use stdio.h printf() et cetera. Not only is it wrong on many levels -- from non-async-safe to the caching issues, possibly non-atomic accesses to FILE fields, if the interrupted code was also doing I/O at the same time --, but the correct solution using unistd.h similar to my example in this edit is just as simple. The rationale for using stdio.h I/O in signal handlers seems to be "it usually works". I detest that, personally, since for example violence also "usually works". I consider it stupid and/or lazy.

I hope you found this informative.

Upvotes: 5

Nominal Animal
Nominal Animal

Reputation: 39426

There is really no race condition; it is worse than that. According to the POSIX standard, the behaviour of the program is undefined (if the signal is delivered at the right moment).

Look at man 7 signal man page, specifically the section under Async-signal safe functions:

A signal handler function must be very careful, since processing elsewhere may be interrupted at some arbitrary point in the execution of the program. POSIX has the concept of "safe function". If a signal interrupts the execution of an unsafe function, and handler calls an unsafe function, then the behavior of the program is undefined.

Note that printf() is definitely not an async-signal safe function; therefore the behaviour is undefined.

In a general case, the solution is nontrivial, because there are no async-signal safe locking primitives (other than sem_post(), which by itself is not enough for this, and file locks, which would have to be used around all printf() calls). The generic, portable solution is to create a pipe using pipe() from unistd.h, and write the output to the pipe using write(), and have the main program read from the pipe and "forward" the contents. POSIX quarantees that writes shorter than PIPE_BUF are atomic, with PIPE_BUF being at least 512 (4096 in Linux) -- see man 7 pipe for details --, so this is also limited to 512-byte or shorter messages in practice for portable code.

Commonly, and in this particular case, it is enough to replace the printf() in the signal handler with setting a global volatile sigatomic_t variable. The main loop can then simply check (and clear) the global variable and output the message itself.

While the flag variable approach can lose rapidly repeated SIGUSR1 signals, it is irrelevant because you can always lose rapidly repeated SIGUSR1 signals: only one can be pending at a time, so repeated signals occurring between the first one and when it is handled, are not delivered at all! (If you were to use realtime signals like SIGRTMIN+0 which are queued, you can make sure you catch every single one by using atomic built-ins like __sync_fetch_and_and(variable,0) or __atomic_exchange_n(variable,0,__ATOMIC_SEQ_CST) in the main loop, and __sync_fetch_and_add(variable,1) or __atomic_fetch_add(variable,1,__ATOMIC_SEQ_CST) in the signal handler; both preceded by a __sync_synchronize() or __atomic_signal_fence(__ATOMIC_SEQ_CST) call to make sure the changes are immediately effective/visible to the other. But you don't need to worry about atomic ops in this case.)

There is an interesting corner case -- not a race condition -- with respect to sigset() and sigprocmask(), too. A process inherits its signal mask from its parent, with SIGUSR1 not blocked by default. Unless handled, it causes the process to terminate. Therefore, depending on the inherited signal mask, a SIGUSR1 signal delivered before the sigset() call is either blocked, or causes the process to terminate. (However, if set contained SIGUSR1; i.e. SIGUSR1 was blocked, then there would be a race condition, unless sigprocmask() was called before sigset(). However, since set is empty, sigset() is best called before sigprocmask().)

Upvotes: 7

Related Questions