othane

Reputation: 648

protect sem_wait() from signals using pthread_sigmask()

I have a library that accesses a hardware resource (SPI) via a third-party library. My library, and in turn the SPI resource, is accessed by multiple processes, so I need to lock the resource with a semaphore. The lock functions are below:

static int spi_lock(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_REALTIME, &ts) == -1)
    {
        syslog(LOG_ERR, "failed to read clock: %s\n", strerror(errno));
        return 3;
    }
    ts.tv_sec += 5;
    if (sem_timedwait(bcoms->spisem, &ts) == -1)
    {
        syslog(LOG_ERR,"timed out trying to acquire %s: %s\n", SPISEM, strerror(errno));
        return 1;
    }
    return 0;
}

static int spi_unlock(void)
{
    int ret = 1;

    if (sem_post(bcoms->spisem))
    {
        syslog(LOG_ERR,"failed to release %s: %s\n", SPISEM, strerror(errno));
        goto done;
    }
    ret = 0;
done:
    return ret;
}

Now my problem is that the library is used in a daemon, and that daemon is stopped via a kill signal. Sometimes the signal arrives while I am holding the semaphore, so the servers cannot be restarted successfully because the lock is perpetually taken. To fix this I am trying to block signals as shown below (I am still waiting for hardware to test this on):

static int spi_lock(void)
{
    sigset_t nset;
    struct timespec ts;

    sigfillset(&nset);
    sigprocmask(SIG_BLOCK, &nset, NULL);

    if (clock_gettime(CLOCK_REALTIME, &ts) == -1)
    {
        syslog(LOG_ERR, "failed to read clock: %s\n", strerror(errno));
        return 3;
    }
    ts.tv_sec += 5; // 5 seconds to acquire the semaphore is HEAPS, so we better bloody get it !!! 
    if (sem_timedwait(bcoms->spisem, &ts) == -1)
    {
        syslog(LOG_ERR,"timed out trying to acquire %s: %s\n", SPISEM, strerror(errno));
        return 1;
    }
    return 0;
}

static int spi_unlock(void)
{
    sigset_t nset;
    int ret = 1;

    if (sem_post(bcoms->spisem))
    {
        syslog(LOG_ERR,"failed to release %s: %s\n", SPISEM, strerror(errno));
        goto done;
    }

    sigfillset(&nset);
    sigprocmask(SIG_UNBLOCK, &nset, NULL);
    ret = 0;
done:
    return ret;
}

But the man page for sigprocmask() says to use pthread_sigmask() in a multi-threaded program, and one of the servers I want to protect will be multi-threaded. What I don't understand is this: if I use pthread_sigmask() in the library, and the main thread spawns an SPI read thread that uses the locking functions in my library, the read thread will be protected. But can't the main thread still receive the kill signal and take down the daemon while the read thread holds the semaphore with signals blocked, getting me nowhere? If so, is there a better solution to this locking problem?

Thanks.

Upvotes: 3

Views: 441

Answers (2)

kaylum

Reputation: 14046

I don't think your approach will work. You cannot block SIGKILL or SIGSTOP, so it could only help if the daemon is being stopped with a different signal (like SIGHUP). But even then I think it's bad practice to block all signals from a library call. That can have adverse effects on the calling application. For example, the application may be relying on particular signals, and missing any of them could cause it to function incorrectly.
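To illustrate the point about SIGKILL and SIGSTOP, here is a minimal sketch that tries to block every signal and then reads the mask back. The function name signal_stays_blocked() is made up for this example, and it uses sigprocmask() for brevity, so it assumes a single-threaded caller (a threaded program would use pthread_sigmask() the same way).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: block all signals, then check whether `signo`
 * really ended up in the blocked set. The original mask is restored
 * before returning.
 * Returns 1 if blocked, 0 if not, -1 on error. */
int signal_stays_blocked(int signo)
{
    sigset_t all, old, current;
    int blocked;

    assert(signo > 0);

    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, &old) == -1)
        return -1;

    /* A NULL set leaves the mask unchanged and just reports it. */
    if (sigprocmask(SIG_BLOCK, NULL, &current) == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    blocked = sigismember(&current, signo);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return blocked;
}
```

Calling signal_stays_blocked(SIGTERM) should report that the signal is blocked, while SIGKILL and SIGSTOP should not be: the request to block them is silently ignored rather than reported as an error.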

As it turns out, there probably isn't an easy way to solve your problem using semaphores. An alternative approach is to use something like flock() instead. That solves your problem because the lock is tied to an open file descriptor: if a process dies while holding an flock, its file descriptors are closed automatically, which releases the lock.
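A rough sketch of what flock-based locking could look like follows. The function names and the idea of a dedicated lock file are assumptions for illustration, not part of the original library. Note that flock() has no timeout like sem_timedwait(); this sketch only offers a blocking mode and a fail-immediately mode (LOCK_NB).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) the lock file at `path`
 * and take an exclusive lock on it. With nonblock != 0 the call fails
 * immediately if another holder exists instead of waiting.
 * Returns the locked fd on success, -1 on failure. */
int spi_flock_acquire(const char *path, int nonblock)
{
    int fd, rc;
    int op = LOCK_EX | (nonblock ? LOCK_NB : 0);

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd == -1)
        return -1;

    /* Retry if a signal handler interrupted the wait. */
    do {
        rc = flock(fd, op);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: drop the lock and close the descriptor.
 * Returns 0 on success, -1 on failure. */
int spi_flock_release(int fd)
{
    int ret = flock(fd, LOCK_UN);

    if (close(fd) == -1)
        ret = -1;
    return ret;
}
```

The SPI access would then sit between spi_flock_acquire() and spi_flock_release(). If the process is killed in between, even with SIGKILL, the kernel closes the descriptor and the lock disappears with it.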

Upvotes: 0

R.. GitHub STOP HELPING ICE

Reputation: 215327

Indeed you've analyzed the problem correctly - masking signals does not protect you. Masking signals is not the right tool to prevent process termination with shared data (like files or shared semaphores) in an inconsistent state.

What you probably should be doing, if you want to exit gracefully on certain signals, is having the program install signal handlers to catch the termination request and feed it into your normal program logic. There are several approaches you can use:

  1. Send the termination request over a pipe to yourself. This works well if your program is structured around a poll loop that can wait for input on a pipe.

  2. Use sem_post, the one async-signal-safe synchronization function, to report the signal to the rest of the program.

  3. Start a dedicated signal-handling thread from the main thread then block all signals in the main thread (and, by inheritance, all other new threads). This thread can just do for(;;) pause(); and since pause is async-signal-safe, you can call any functions you want from the signal handlers -- including the pthread sync functions needed for synchronizing with other threads.
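As a minimal sketch of the first approach (the pipe to yourself), assuming SIGTERM and SIGINT are the termination signals of interest: the names term_pipe_init() and term_requested() are invented for this example, and a real daemon would add the read end of the pipe to its existing poll loop rather than polling it separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int term_pipe[2] = { -1, -1 };

/* Signal handler: only calls write(), which is async-signal-safe. */
static void on_term(int signo)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)signo;

    /* The pipe is non-blocking, so a full pipe cannot stall the handler. */
    if (write(term_pipe[1], &b, 1) == -1) {
        /* Nothing useful can be done here. */
    }
    errno = saved_errno;
}

/* Create the pipe and install the handlers.
 * Returns the read end of the pipe, or -1 on error. */
int term_pipe_init(void)
{
    struct sigaction sa;
    int i, flags;

    if (pipe(term_pipe) == -1)
        return -1;
    for (i = 0; i < 2; i++) {
        flags = fcntl(term_pipe[i], F_GETFL);
        if (flags == -1 ||
            fcntl(term_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGTERM, &sa, NULL) == -1 ||
        sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    return term_pipe[0];
}

/* Wait up to timeout_ms for a termination request.
 * Returns 1 if one is pending, 0 if not, -1 on error. */
int term_requested(int timeout_ms)
{
    struct pollfd pfd = { .fd = term_pipe[0], .events = POLLIN };
    int n;

    assert(term_pipe[0] != -1);  /* term_pipe_init() must run first */

    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

With something like this in place, the daemon would check term_requested() only at points where it is not holding the SPI semaphore, for example after spi_unlock() returns, and exit from there.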

Note that this approach will still not be "perfect" since you can never catch or block SIGKILL. If a user decides to kill your process with SIGKILL (kill -9) then the semaphore can be left in a bad state and there's nothing you can do.

Upvotes: 2
