Fara Importanta

Reputation: 161

pthread_cond_timedwait() usage for cancelling lengthy task

I have a situation where I would like to cancel a thread if it takes too long to complete. For this, I am using a second thread that waits for the first thread to finish, but for no more than a given number of seconds. The pthread_cond_timedwait() function seems to fit my usage scenario perfectly, but it doesn't behave as I expected. More specifically, even though pthread_cond_timedwait() returns ETIMEDOUT, it does so only after the thread it was supposed to cancel has finished, which defeats the whole purpose.

This is my test code:

    #include <pthread.h>   // pthread_* functions and types
    #include <time.h>      // clock_gettime, timespec
    #include <unistd.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <iostream>
    #include <cstring>

    #define WAIT_INTERVAL 5
    #define THREAD_SLEEP 10

    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t condition = PTHREAD_COND_INITIALIZER;

    pthread_t t1;
    pthread_t t2;

    void* f1(void*);
    void* f2(void*);

    int main()
    {
        pthread_create(&t1, NULL, &f1, NULL);
        pthread_create(&t2, NULL, &f2, NULL);

        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        std::cout << "Thread(s) successfully finished" << std::endl << std::flush;

        exit(EXIT_SUCCESS);
    }

    void* f1(void*)
    {
        pthread_mutex_lock(&mutex);
        timespec ts = {0};
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec += WAIT_INTERVAL;
        std::cout << __FUNCTION__ << ": Waiting for at most " << WAIT_INTERVAL << " seconds starting now" << std::endl << std::flush;
        int waitResult = pthread_cond_timedwait(&condition, &mutex, &ts);
        if (waitResult == ETIMEDOUT)
        {
            std::cout << __FUNCTION__ << ": Timed out" << std::endl << std::flush;
            int cancelResult = pthread_cancel(t2);
            if (cancelResult)
            {
                std::cout << __FUNCTION__ << ": Could not cancel T2 : " << strerror(cancelResult) << std::endl << std::flush;
            }
            else
            {
                std::cout << __FUNCTION__ << ": Cancelled T2" << std::endl << std::flush;
            }
        }
        std::cout << __FUNCTION__ << ": Finished waiting with code " << waitResult << std::endl << std::flush;
        pthread_mutex_unlock(&mutex);
        return NULL;   // a void* function must return a value
    }

    void* f2(void*)
    {
        pthread_mutex_lock(&mutex);
        std::cout << __FUNCTION__ << ": Started simulating lengthy operation for " << THREAD_SLEEP << " seconds" << std::endl << std::flush;
        sleep(THREAD_SLEEP);
        std::cout << __FUNCTION__ << ": Finished simulation, signaling the condition variable" << std::endl << std::flush;
        pthread_cond_signal(&condition);
        pthread_mutex_unlock(&mutex);
        return NULL;   // a void* function must return a value
    }

The output I get from the above code is:

    f1: Waiting for at most 5 seconds starting now
    f2: Started simulating lengthy operation for 10 seconds
    f2: Finished simulation, signaling the condition variable
    f1: Timed out
    f1: Could not cancel T2 : No such process
    f1: Finished waiting with code 110
    Thread(s) successfully finished

Given that this is my first time with POSIX threads, I think I'm missing something which may be pretty obvious.

I have read numerous tutorials, articles and answers about this, but none covers my use case and none offered any hint.

Please note that, for brevity, I have removed some of the code that handled the predicate mentioned in the pthread_cond_timedwait manual, as that doesn't change anything in the behaviour.

I am using POSIX threads on a CentOS 6.5 machine. My development and test environment:

    2.6.32-431.5.1.el6.centos.plus.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
    g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

Compilation command: g++ -o executable_binary -pthread -lrt source_code.cpp

Upvotes: 4

Views: 981

Answers (2)

stefaanv

Reputation: 14392

Edit: I first advised against using pthread_cond_timedwait(), but I think it is okay in this situation, because it keeps the first thread from waiting longer than needed. Instead of checking the return value, though, I would check a 'finished' flag that is set by the second thread and protected by the mutex.

The problem in your example is the mutex. The first thread takes it, and the pthread_cond_timedwait() call releases it while waiting. The second thread then takes it and holds it for the whole sleep. pthread_cond_timedwait() has to reacquire the mutex before it can return, even on a timeout, so the first thread stays blocked until the second thread releases the mutex at the end.
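A minimal sketch of that idea, assuming the lengthy work does not need the mutex while it runs: the worker locks only to set the 'finished' flag, and the waiter loops on that flag instead of trusting the return value alone. The names `Task`, `worker`, `sleep_ms` and `run_with_timeout` are made up for illustration and are not from the question.

```cpp
#include <pthread.h>
#include <time.h>
#include <errno.h>
#include <cassert>

// Hypothetical helper type: shared state between waiter and worker.
struct Task {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            finished;  // the predicate, protected by mutex
    long            work_ms;   // simulated duration of the lengthy task
};

static void sleep_ms(long ms)
{
    timespec d;
    d.tv_sec  = ms / 1000;
    d.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&d, NULL);       // nanosleep() is a cancellation point
}

static void* worker(void* arg)
{
    Task* task = static_cast<Task*>(arg);
    sleep_ms(task->work_ms);             // lengthy work, done WITHOUT the mutex
    pthread_mutex_lock(&task->mutex);    // lock only to publish the result
    task->finished = true;
    pthread_cond_signal(&task->cond);
    pthread_mutex_unlock(&task->mutex);
    return NULL;
}

// Returns true if the timeout expired before the worker set 'finished'
// (the worker is then cancelled), false if it finished in time.
bool run_with_timeout(long work_ms, long timeout_ms)
{
    Task task;
    pthread_mutex_init(&task.mutex, NULL);
    pthread_cond_init(&task.cond, NULL);
    task.finished = false;
    task.work_ms  = work_ms;

    // Absolute deadline, as pthread_cond_timedwait() expects.
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_t tid;
    int rc = pthread_create(&tid, NULL, &worker, &task);
    assert(rc == 0);   // sketch only: real code should handle the error
    (void)rc;

    pthread_mutex_lock(&task.mutex);
    int wait_rc = 0;
    // Loop on the predicate: this also copes with spurious wakeups.
    while (!task.finished && wait_rc == 0)
        wait_rc = pthread_cond_timedwait(&task.cond, &task.mutex, &deadline);
    bool timed_out = !task.finished;
    pthread_mutex_unlock(&task.mutex);

    if (timed_out)
        pthread_cancel(tid);   // acted on at the worker's next cancellation point
    pthread_join(tid, NULL);

    pthread_cond_destroy(&task.cond);
    pthread_mutex_destroy(&task.mutex);
    return timed_out;
}
```

Calling `run_with_timeout(10000, 5000)` from main() mirrors the 10-second task and 5-second wait from the question. Because the worker no longer holds the mutex during its work, the waiter can reacquire it as soon as the deadline passes.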

Upvotes: 4

GMasucci

Reputation: 2882

You are setting up the two threads with

    pthread_create(&t1, NULL, &f1, NULL);
    pthread_create(&t2, NULL, &f2, NULL);

Instead of only joining them, I would use thread t1 to cancel t2: in thread t1, add a line that reads pthread_cancel(t2) after your timer has elapsed.

This sends t2 a cancellation request, which by default takes effect at t2's next cancellation point (sleep() is one). You can leave the two join statements in place; main() will then patiently wait for t2 to complete its death throes before carrying on :)
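As a rough illustration of the cancel-then-join pattern, here is a small sketch. The names `sleeper` and `cancel_and_join` are hypothetical, and the 30-second sleep just stands in for a lengthy task. Joining a cancelled thread yields PTHREAD_CANCELED as its exit status, which gives a way to confirm the cancellation took effect.

```cpp
#include <pthread.h>
#include <time.h>
#include <cassert>

// Stand-in for a lengthy task (hypothetical).
static void* sleeper(void*)
{
    timespec d;
    d.tv_sec  = 30;
    d.tv_nsec = 0;
    nanosleep(&d, NULL);   // cancellation point: a pending cancel is acted on here
    return NULL;
}

// Starts the task, cancels it, and reports whether it exited via cancellation.
bool cancel_and_join()
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, &sleeper, NULL);
    assert(rc == 0);       // sketch only: real code should handle the error
    (void)rc;

    pthread_cancel(tid);   // only a request; the thread ends at a cancellation point

    void* result = NULL;
    pthread_join(tid, &result);   // waits until the thread has really terminated
    return result == PTHREAD_CANCELED;
}
```

Note that a thread holding a mutex when it is cancelled does not release it automatically; pthread_cleanup_push() exists for that case.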

Let me know if you need more info :)

Upvotes: 1
