pthread_cond_timedwait() usage for cancelling lengthy task

Question

I have a situation where I would like to cancel a thread if it takes too long to complete. For this, I am using a second thread that waits for the first thread to finish, but for no more than a given number of seconds. The pthread_cond_timedwait() function seems to fit my usage scenario perfectly; however, it doesn't behave as I expected. More specifically, even though pthread_cond_timedwait() returns ETIMEDOUT, it does so only after the thread it was supposed to cancel has finished, which defeats the whole purpose.

This is my test code:

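    #include <pthread.h>   // pthread_* API
    #include <unistd.h>    // sleep
    #include <cerrno>      // ETIMEDOUT
    #include <cstdlib>     // exit, EXIT_SUCCESS
    #include <cstring>     // strerror
    #include <ctime>       // timespec, clock_gettime
    #include <iostream>    // std::cout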

    #define WAIT_INTERVAL 5
    #define THREAD_SLEEP 10

    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t condition = PTHREAD_COND_INITIALIZER;

    pthread_t t1;
    pthread_t t2;

    void* f1(void*);
    void* f2(void*);

    int main()
    {
        pthread_create(&t1, NULL, &f1, NULL);
        pthread_create(&t2, NULL, &f2, NULL);

        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        std::cout << "Thread(s) successfully finished" << std::endl << std::flush;

        exit(EXIT_SUCCESS);
    }

    void* f1(void*)
    {
        pthread_mutex_lock(&mutex);
        timespec ts = {0};
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec += WAIT_INTERVAL;
        std::cout << __FUNCTION__ << ": Waiting for at most " << WAIT_INTERVAL << " seconds starting now" << std::endl << std::flush;
        int waitResult = pthread_cond_timedwait(&condition, &mutex, &ts);
        if (waitResult == ETIMEDOUT)
        {
            std::cout << __FUNCTION__ << ": Timed out" << std::endl << std::flush;
            int cancelResult = pthread_cancel(t2);
            if (cancelResult)
            {
                std::cout << __FUNCTION__ << ": Could not cancel T2 : " << strerror(cancelResult) << std::endl << std::flush;
            }
            else
            {
                std::cout << __FUNCTION__ << ": Cancelled T2" << std::endl << std::flush;
            }
        }
        std::cout << __FUNCTION__ << ": Finished waiting with code " << waitResult << std::endl << std::flush;
        pthread_mutex_unlock(&mutex);
        return NULL; // a thread function declared void* must return a value
    }

    void* f2(void*)
    {
        pthread_mutex_lock(&mutex);
        std::cout << __FUNCTION__ << ": Started simulating lengthy operation for " << THREAD_SLEEP << " seconds" << std::endl << std::flush;
        sleep(THREAD_SLEEP);
        std::cout << __FUNCTION__ << ": Finished simulation, signaling the condition variable" << std::endl << std::flush;
        pthread_cond_signal(&condition);
        pthread_mutex_unlock(&mutex);
        return NULL; // a thread function declared void* must return a value
    }

The output I get from the above code is:

    f1: Waiting for at most 5 seconds starting now
    f2: Started simulating lengthy operation for 10 seconds
    f2: Finished simulation, signaling the condition variable
    f1: Timed out
    f1: Could not cancel T2 : No such process
    f1: Finished waiting with code 110
    Thread(s) successfully finished

Given that this is my first time with POSIX threads, I think I'm missing something which may be pretty obvious.

I have read numerous tutorials, articles and answers about this, but none of them covers my use case or offers any hint.

Please note that, for brevity, I have removed some of the code that handled the predicate mentioned in the pthread_cond_timedwait manual, as that doesn't change anything in the behaviour.

I am using POSIX threads on a CentOS 6.5 machine. My development and test environment:

    2.6.32-431.5.1.el6.centos.plus.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
    g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

Compilation command: g++ -o executable_binary -pthread -lrt source_code.cpp

stefaanv · Accepted Answer

Edit: I first advised against using pthread_cond_timedwait(), but I think it is fine in this situation, since it means the first thread doesn't wait longer than needed. Instead of checking the return value, though, I would check a 'finished' flag that is set by the second thread and protected by the mutex.
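A minimal sketch of that flag-based approach might look like the following. The names `finished`, `worker`, `waitForWorker` and `runWithTimeout` are made up for illustration and are not from the question. It assumes the worker can do its long operation without holding the mutex and only locks it briefly to set the flag, and that default (deferred) cancellation is in effect.

```cpp
#include <pthread.h>
#include <time.h>
#include <unistd.h>
#include <cerrno>

// Shared state: 'finished' is only read or written while 'mutex' is held.
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condition = PTHREAD_COND_INITIALIZER;
static bool finished = false;

// Worker: simulates the lengthy operation WITHOUT holding the mutex, then
// locks it only for the short moment needed to publish the result.
static void* worker(void* arg)
{
    unsigned int seconds = *static_cast<unsigned int*>(arg);
    sleep(seconds); // sleep() is a cancellation point

    pthread_mutex_lock(&mutex);
    finished = true;
    pthread_cond_signal(&condition);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// Waits until 'finished' is set or the deadline passes.
// Returns the value of the flag, not the return code of the wait.
static bool waitForWorker(int timeoutSeconds)
{
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeoutSeconds;

    pthread_mutex_lock(&mutex);
    int rc = 0;
    // Loop on the predicate because spurious wakeups are permitted.
    while (!finished && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&condition, &mutex, &deadline);
    bool done = finished;
    pthread_mutex_unlock(&mutex);
    return done;
}

// Runs the worker with a time limit; cancels it if the limit is exceeded.
// Returns true if the worker finished in time.
bool runWithTimeout(unsigned int workSeconds, int timeoutSeconds)
{
    pthread_mutex_lock(&mutex);
    finished = false;
    pthread_mutex_unlock(&mutex);

    pthread_t thread;
    pthread_create(&thread, NULL, &worker, &workSeconds);

    bool done = waitForWorker(timeoutSeconds);
    if (!done)
        pthread_cancel(thread); // acted upon at the worker's next cancellation point

    pthread_join(thread, NULL); // workSeconds must outlive the worker
    return done;
}
```

Called as `runWithTimeout(10, 5)`, this should give up after roughly five seconds and cancel the worker, because the waiter can get the mutex back as soon as the timeout expires.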

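The problem in your example is that the first thread takes the mutex, and the pthread_cond_timedwait() call then releases it while waiting. The second thread takes the mutex at that point and holds it for the whole sleep. pthread_cond_timedwait() has to reacquire the mutex before it can return, even on timeout, so the first thread stays blocked until the second thread releases the mutex at the end. By then the second thread is done, which is presumably why pthread_cancel() reports "No such process".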
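The effect can be measured in isolation. The sketch below uses hypothetical names (`WaitReport`, `holdMutex`, `timedWaitWhileMutexHeld`) that are not from the question. It starts a thread that grabs the mutex as soon as the waiter releases it and keeps it for `holdSeconds`, then reports how long the timed wait really took.

```cpp
#include <pthread.h>
#include <time.h>
#include <unistd.h>
#include <cerrno>

static pthread_mutex_t demoMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demoCond = PTHREAD_COND_INITIALIZER;

// Result of one timed wait: the return code and the wall-clock time it took.
struct WaitReport
{
    int result;
    double elapsedSeconds;
};

// Mimics f2 from the question: holds the mutex for the whole "operation".
static void* holdMutex(void* arg)
{
    unsigned int seconds = *static_cast<unsigned int*>(arg);
    pthread_mutex_lock(&demoMutex);   // succeeds once the waiter is inside timedwait
    sleep(seconds);                   // lengthy work done while holding the mutex
    pthread_mutex_unlock(&demoMutex);
    return NULL;
}

// Waits on the condition variable for at most timeoutSeconds while another
// thread holds the mutex for holdSeconds. Nobody signals the condition.
WaitReport timedWaitWhileMutexHeld(int timeoutSeconds, unsigned int holdSeconds)
{
    pthread_mutex_lock(&demoMutex);

    pthread_t holder;
    pthread_create(&holder, NULL, &holdMutex, &holdSeconds);

    timespec start;
    clock_gettime(CLOCK_REALTIME, &start);
    timespec deadline = start;
    deadline.tv_sec += timeoutSeconds;

    WaitReport report;
    // Releases demoMutex while waiting, but must reacquire it before returning.
    report.result = pthread_cond_timedwait(&demoCond, &demoMutex, &deadline);

    timespec end;
    clock_gettime(CLOCK_REALTIME, &end);
    report.elapsedSeconds = (end.tv_sec - start.tv_sec)
                          + (end.tv_nsec - start.tv_nsec) / 1e9;

    pthread_mutex_unlock(&demoMutex);
    pthread_join(holder, NULL);
    return report;
}
```

With a timeout of 1 second and a hold time of 2 seconds, the call is expected to return ETIMEDOUT after about 2 seconds rather than 1, which is the same delay seen in the question's output.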
