koalo

Reputation: 2323

Force program termination if threads block in C++

When a class is responsible for managing a thread, a common pattern (see for example here) is to join the thread in the destructor after making sure that it will finish in time. However, as outlined in the linked thread, this is not always trivial, and a mistake leads to a program that never terminates. The following example reproduces such a situation:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
using namespace std::chrono_literals;

class Foo {
public:
    Foo() {
        mythread = std::thread([this]() {
            int i = 0;
            while (running) {
                std::cout << "hi" << std::endl;
                if (i++ >= 2) {
                    // placeholder for e.g. a blocking condition variable
                    std::this_thread::sleep_for(1000h);
                }
                std::this_thread::sleep_for(500ms);
            }
        });
    }

    ~Foo() {
        running = false;
        mythread.join();
    }

private:
    std::thread mythread;
    // atomic: written by the destructor while the worker thread reads it
    std::atomic<bool> running{true};
};

int main() {
    Foo bar;
    std::this_thread::sleep_for(1s);

    // enabling this line will block the termination
    //std::this_thread::sleep_for(2s);

    std::cout << "ending" << std::endl;
}

What I am looking for is a solution that forcefully terminates the program if this situation occurs. Of course, one should always strive to finish the thread properly, but such a feature would be a good last resort for peace of mind, especially on unattended embedded systems, where a crashed program can be restored and debugged more easily than a blocked one.

A rough draft of a solution would be to start a thread at the end of main that sleeps for a few seconds and calls std::terminate if the program has not ended by then (ideally also reporting a corresponding error). However, this is a chicken-and-egg problem, because the new thread will of course itself keep the program from ending in time. I would highly appreciate any ideas.

EDIT: The solution should not require modifying the Foo class itself, so that it also covers such bugs in code that cannot be modified, e.g. external libraries. Ideally, it would even cover threads that no class feels responsible for ending before main ends (objects with static storage duration, or even no-longer-referenced objects with dynamic storage duration), but that might not be possible at all without in-depth OS hacking or an external process monitor.

Upvotes: 0

Views: 285

Answers (1)

Ihor Drachuk

Reputation: 1293

There are several solutions:

  • Investigate and fix the root problem (this is the best and the correct solution).

Workarounds:

  • Have the thread signal that it is exiting via a condition variable, and call join only after that. If the condition variable's wait_for returns with a timeout, kill the thread (a bad solution; it brings other problems).
  • Create a watchdog thread that checks a time counter. The application must reset the counter from time to time. If the watchdog thread sees that the counter has grown too large, it restarts the whole application.
  • Move the suspicious code out of your application into a separate process and communicate with it via IPC. If problems occur, restart that process (the best of the workarounds).

Upvotes: 1
