ImpendingShroom

Reputation: 33

How do I stop threads stalling on pthread_join?

I've got a project where I'm adding jobs to a queue and I have multiple threads taking jobs, and calculating their own independent results.

My program handles the SIGINT signal, and in the handler I'm attempting to join the threads, add up the results, print them to the screen, and then exit. My problem is that the threads either seem to stop functioning when I send the signal, or they get blocked in pthread_mutex_lock. Here are the important parts of my program, trimmed to keep this concise.

main.c

//the thread pool has a queue of jobs inside
//called jobs (which is a struct)
struct thread_pool * pool;

void signal_handler(int signo) {
    pool->jobs->running = 0; //stop the thread pool
    pthread_cond_broadcast(pool->jobs->cond);

    void *retval;
    for (int i = 0; i < pool->thread_count; i++) {
        pthread_join(pool->threads[i], &retval);
        //do stuff with retval
    }

    //print results then exit
    exit(EXIT_SUCCESS);
}

int main() {
    signal(SIGINT, signal_handler);
    //set up threadpool and jobpool
    //start threads (they all run the workerThread function)
    while (1) {
        //send jobs to the job pool
    }
    return 0;
}

thread_stuff.c

void add_job(struct jobs * j) {
    if (j->running) {
        pthread_mutex_lock(j->mutex);
        //add job to queue and update count and empty
        pthread_cond_signal(j->cond);
        pthread_mutex_unlock(j->mutex);
    }
}

struct job * get_job(struct jobs * j) {

    pthread_mutex_lock(j->mutex);

    while (j->running && j->empty)
        pthread_cond_wait(j->cond, j->mutex);

    if (!j->running || j->empty) return NULL;

    //get the next job from the queue
    //unlock mutex and send a signal to other threads
    //waiting on the condition
    pthread_cond_signal(j->cond);
    pthread_mutex_unlock(j->mutex);
    //return new job
}

void * workerThread(void * arg) {
    struct jobs * j = (struct jobs *) arg;
    int results = 0;
    while (j->running) {
        //get next job and process results
    }
    return results;
}

Thanks for your help, this is giving me a real headache!

Upvotes: 0

Views: 693

Answers (1)

Florian Weimer

Reputation: 33717

You should not call pthread_cond_wait or pthread_join from a signal handler which handles asynchronously generated signals such as SIGINT. Instead, you should block SIGINT for all threads, spawn a dedicated thread, and call sigwait there. This means that you detect the arrival of the SIGINT signal outside of a signal handler context, so that you are not restricted to async-signal-safe functions. You also avoid the risk of self-deadlock in case the signal is delivered to one of the worker threads.
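
A minimal sketch of that arrangement, under some assumptions: the global `running` flag and the counting `worker` are hypothetical stand-ins for the question's `jobs->running` and `workerThread`, and `block_sigint`, `sigint_waiter`, `send_sigint_to_self` and `run_until_sigint` are names invented for this illustration. SIGINT is blocked before any thread is created, a dedicated thread sits in `sigwait`, and the joins happen in ordinary thread context rather than in a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define MAX_WORKERS 16

/* Hypothetical stand-in for the question's jobs->running flag. */
static atomic_int running = 1;

/* Hypothetical worker: counts iterations instead of processing real jobs. */
static void *worker(void *arg) {
    (void)arg;
    intptr_t results = 0;
    struct timespec delay = {0, 1000000}; /* 1 ms of simulated work */
    while (atomic_load(&running)) {
        results++;
        nanosleep(&delay, NULL);
    }
    return (void *)results;
}

/* Dedicated thread: waits synchronously for SIGINT, then requests shutdown. */
static void *sigint_waiter(void *arg) {
    const sigset_t *set = arg;
    int signo = 0;
    sigwait(set, &signo);      /* returns on SIGINT, or at once on error */
    atomic_store(&running, 0); /* either way, tell the workers to stop */
    return NULL;
}

/* Call from main() before creating any thread: new threads inherit the mask. */
int block_sigint(sigset_t *set) {
    sigemptyset(set);
    sigaddset(set, SIGINT);
    return pthread_sigmask(SIG_BLOCK, set, NULL);
}

/* Sends a process-directed SIGINT, like Ctrl-C. Only call this after
   block_sigint(); otherwise the default action terminates the process. */
int send_sigint_to_self(void) {
    return kill(getpid(), SIGINT);
}

/* Runs the workers until SIGINT arrives; returns the summed results, or -1. */
long run_until_sigint(const sigset_t *set, int nworkers) {
    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    pthread_t waiter, workers[MAX_WORKERS];
    int started = 0;
    long total = 0;

    atomic_store(&running, 1);
    if (pthread_create(&waiter, NULL, sigint_waiter, (void *)set) != 0)
        return -1;
    while (started < nworkers &&
           pthread_create(&workers[started], NULL, worker, NULL) == 0)
        started++;

    /* Ordinary thread context: pthread_join is fine here. */
    pthread_join(waiter, NULL);
    for (int i = 0; i < started; i++) {
        void *retval = NULL;
        pthread_join(workers[i], &retval);
        total += (long)(intptr_t)retval;
    }
    return total;
}
```

From `main`, this would be used by calling `block_sigint(&set)` first and then `run_until_sigint(&set, 4)`, printing the returned total and exiting normally. Build with `-pthread`.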

At this point, you just need to shut down your work queue/thread pool in an orderly manner. Depending on the details, your existing approach with the running flag might even work unchanged.
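
One detail worth checking before relying on the existing flag: the question's get_job returns NULL while still holding the mutex when the pool is stopping, so any other worker that wakes up afterwards would block forever. Below is a sketch of a queue that shuts down cleanly. The fixed-size `struct jobs`, the `int` job payload, and the names `jobs_init` and `jobs_shutdown` are assumptions made for illustration, since the question does not show its own definitions. As in the question, jobs still queued at shutdown are discarded.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAP 64

/* Hypothetical fixed-size queue; the question's struct jobs is not shown. */
struct jobs {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int items[QUEUE_CAP];
    size_t head, count;
    int running;
};

void jobs_init(struct jobs *j) {
    pthread_mutex_init(&j->mutex, NULL);
    pthread_cond_init(&j->cond, NULL);
    j->head = j->count = 0;
    j->running = 1;
}

/* Returns 1 if queued, 0 if the queue is full or shutting down.
   The running flag is read under the mutex, not before taking it. */
int add_job(struct jobs *j, int item) {
    int ok = 0;
    pthread_mutex_lock(&j->mutex);
    if (j->running && j->count < QUEUE_CAP) {
        j->items[(j->head + j->count) % QUEUE_CAP] = item;
        j->count++;
        ok = 1;
        pthread_cond_signal(&j->cond);
    }
    pthread_mutex_unlock(&j->mutex);
    return ok;
}

/* Returns 1 and stores a job in *out, or 0 once the queue is shut down. */
int get_job(struct jobs *j, int *out) {
    int ok = 0;
    pthread_mutex_lock(&j->mutex);
    while (j->running && j->count == 0)
        pthread_cond_wait(&j->cond, &j->mutex);
    if (j->running) {
        assert(j->count > 0); /* the wait loop only exits this way */
        *out = j->items[j->head];
        j->head = (j->head + 1) % QUEUE_CAP;
        j->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&j->mutex); /* single exit path: always unlocked */
    return ok;
}

/* Safe to call from the sigwait thread: it is ordinary thread context. */
void jobs_shutdown(struct jobs *j) {
    pthread_mutex_lock(&j->mutex);
    j->running = 0;
    pthread_cond_broadcast(&j->cond); /* wake every waiting worker */
    pthread_mutex_unlock(&j->mutex);
}

/* Worker loop: exits when get_job reports shutdown; returns its sum. */
void *worker_thread(void *arg) {
    struct jobs *j = arg;
    intptr_t results = 0;
    int item;
    while (get_job(j, &item))
        results += item;
    return (void *)results;
}
```

Setting the flag and broadcasting while holding the mutex means a worker cannot check `running`, miss the wakeup, and then sleep in pthread_cond_wait indefinitely.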

Upvotes: 3
