Using a thread in C++ to report progress of computations

Question

I'm writing a generic abstract class to be able to report on the status of as many instance variables as we need. For instance, consider the following useless loop:

int a, b;
for (int i=0; i < 10000; ++i) {
    for (int j=0; j < 1000; ++j) {
        for (int k =0; k < 1000; ++k) {
            a = i;
            b = j;
        }
    }
}

It would be nice to be able to see the values of a and b without having to modify the loop. In the past I have written if statements such as the following:

int a, b;
for (int i=0; i < 10000; ++i) {
    for (int j=0; j < 1000; ++j) {
        for (int k =0; k < 1000; ++k) {
            a = i;
            b = j;
            if (a % 100 == 0) {
                printf("a = %d\n", a);
            }
        }
    }
}

This would allow me to see the value of a every 100 iterations. However, depending on the computations being done, sometimes it is just not possible to check on the progress in this fashion. The idea is to be able to walk away from the computer, come back after a given time, and check on whatever values you want to see.

To this end we can use pthreads. The following code works; I am posting it only because I'm not sure I'm using the thread correctly, in particular how to shut it off.

First, let's consider the file "reporter.h":

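#include <pthread.h>  // pthread_t, pthread_create, pthread_cancel
#include <stdio.h>    // FILE, fopen, fprintf, fclose
#include <time.h>     // struct timespec, nanosleep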

void* run_reporter(void*);

class reporter {
public: 
    pthread_t thread;
    bool stdstream;
    FILE* fp;

    struct timespec sleepTime;
    struct timespec remainingSleepTime;

    const char* filename;
    const int sleepT;
    double totalTime;

    reporter(int st, FILE* fp_): fp(fp_), filename(NULL), stdstream(true), sleepT(st) {
        begin_report();
    }
    reporter(int st, const char* fn): fp(NULL), filename(fn), stdstream(false), sleepT(st) {
        begin_report();
    }
    void begin_report() {
        totalTime = 0;
        if (!stdstream) fp = fopen(filename, "w");
        fprintf(fp, "reporting every %d seconds ...\n", sleepT);
        if (!stdstream) fclose(fp);
        pthread_create(&thread, NULL, run_reporter, this);
    }
    void sleep() {
        sleepTime.tv_sec=sleepT;
        sleepTime.tv_nsec=0;
        nanosleep(&sleepTime, &remainingSleepTime);
        totalTime += sleepT;
    }
    virtual void report() = 0;
    void end_report() {
        pthread_cancel(thread);
        // Wrong addition of remaining time, needs to be fixed
        // but non-important at the moment.
        //totalTime += sleepT - remainingSleepTime.tv_sec;
        long sec = remainingSleepTime.tv_sec;
        if (!stdstream) fp = fopen(filename, "a");
        fprintf(fp, "reported for %g seconds.\n", totalTime);
        if (!stdstream) fclose(fp);
    }
};

void* run_reporter(void* rep_){
    reporter* rep = (reporter*)rep_;
    while(1) {
        if (!rep->stdstream) rep->fp = fopen(rep->filename, "a");
        rep->report();
        if (!rep->stdstream) fclose(rep->fp);
        rep->sleep();
    }
}

This file declares the abstract class reporter; notice the pure virtual function report, which is the function that prints the messages. Each reporter has its own thread, and the thread is created when the reporter constructor is called. To use the reporter object in our useless loop, we can now do:

#include "reporter.h"
int main() {
    // Declaration of objects we want to track
    int a = 0;
    int b = 0;
    // Declaration of reporter
    class prog_reporter: public reporter {
    public:
        int& a;
        int& b;
        prog_reporter(int& a_, int& b_):
            a(a_), b(b_),
            reporter(3, stdout)
        {}
        void report() {
            fprintf(fp, "(a, b) = (%d, %d)\n", this->a, this->b);
        }
    };
    // Start tracking a and b every 3 seconds
    prog_reporter rep(a, b);

    // Do some useless computation
    for (int i=0; i < 10000; ++i) {
        for (int j=0; j < 1000; ++j) {
            for (int k =0; k < 1000; ++k) {
                a = i;
                b = j;
            }
        }
    }
    // Stop reporting
    rep.end_report();
}

After compiling this code (no optimization flag) and running it I obtain:

macbook-pro:Desktop jmlopez$ g++ testing.cpp
macbook-pro:Desktop jmlopez$ ./a.out 
reporting every 3 seconds ...
(a, b) = (0, 60)
(a, b) = (1497, 713)
(a, b) = (2996, 309)
(a, b) = (4497, 478)
(a, b) = (5996, 703)
(a, b) = (7420, 978)
(a, b) = (8915, 78)
reported for 18 seconds.

This does exactly what I wanted it to do. With the optimization flag, however, I get:

macbook-pro:Desktop jmlopez$ g++ testing.cpp -O3
macbook-pro:Desktop jmlopez$ ./a.out 
reporting every 3 seconds ...
(a, b) = (0, 0)
reported for 0 seconds.

This is not surprising, since the compiler probably rewrote my code to give the same answer in a shorter amount of time. My original question was going to be why the reporter did not give me the values of the variables when I made the loops longer, for instance:

for (int i=0; i < 1000000; ++i) {
    for (int j=0; j < 100000; ++j) {
        for (int k =0; k < 100000; ++k) {
            a = i;
            b = j;
        }
    }
}

After running the code again with the optimization flag:

macbook-pro:Desktop jmlopez$ g++ testing.cpp -O3
macbook-pro:Desktop jmlopez$ ./a.out 
reporting every 3 seconds ...
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
(a, b) = (0, 0)
reported for 39 seconds.

Question: Is this output due to the optimization flag, which modifies the code so that it simply does not update the variables until the very end?
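One way to probe this is sketched below, under assumptions that are not part of the original code: C++11's std::atomic is available, and the names progress, seen, observer and observe_progress_demo are hypothetical. With plain int variables that no other code in the loop reads, nothing obliges the compiler to perform each store. Stores to a std::atomic are different: in practice mainstream compilers emit every one of them even at -O3, so a second thread can watch the value change.

```cpp
#include <pthread.h>
#include <atomic>

// Hypothetical names throughout; this is not the reporter class from the question.
static std::atomic<unsigned long long> progress(0); // value the busy loop publishes
static std::atomic<bool> seen(false);               // set once the observer saw an update

// Observer thread: waits until it reads a nonzero progress value.
static void* observer(void* out) {
    unsigned long long value = 0;
    while ((value = progress.load()) == 0) {
        // spin; each load is a real read of the atomic
    }
    *static_cast<unsigned long long*>(out) = value;
    seen.store(true);
    return NULL;
}

// Runs a busy loop that publishes its counter and returns the first
// nonzero value the observer thread managed to read.
unsigned long long observe_progress_demo() {
    progress.store(0);
    seen.store(false);
    unsigned long long observed = 0;
    pthread_t thread;
    if (pthread_create(&thread, NULL, observer, &observed) != 0) return 0;
    for (unsigned long long i = 1; !seen.load(); ++i) {
        progress.store(i); // atomic store: emitted on every iteration in practice
    }
    pthread_join(thread, NULL); // after the join, reading 'observed' is safe
    return observed;
}
```

If the observer reports a nonzero value here while the plain-int version keeps printing (0, 0), that is consistent with the optimizer dropping or deferring the non-atomic stores.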

Main question:

In the reporter method end_report I call the function pthread_cancel. Reading the following answer made me doubt my use of this function and the way I terminate the reporting thread. For those experienced with pthreads, are there any obvious holes or potential problems in using the thread as I have done?

sonicwave · Accepted Answer

About the main question: You're close. Add a call to pthread_join() (http://linux.die.net/man/3/pthread_join) after pthread_cancel(), and everything should be fine.

The join call makes sure that the thread's resources are cleaned up. Forgetting it can, in certain cases, lead to running out of threading resources.
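A minimal sketch of the cancel-then-join pattern, independent of the reporter class: the names sleepy_worker and cancel_and_join_demo are hypothetical, and the worker only sleeps in a loop to stand in for the reporting thread.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical worker: sleeps forever in short steps.
// nanosleep() is a cancellation point, so a pending cancel takes effect there.
static void* sleepy_worker(void*) {
    struct timespec ts;
    ts.tv_sec = 0;
    ts.tv_nsec = 50 * 1000 * 1000; // 50 ms
    for (;;) {
        nanosleep(&ts, NULL);
    }
    return NULL; // never reached
}

// Starts the worker, requests cancellation, then joins it.
// Returns true if the join reports that the thread ended by cancellation.
bool cancel_and_join_demo() {
    pthread_t thread;
    if (pthread_create(&thread, NULL, sleepy_worker, NULL) != 0) return false;
    if (pthread_cancel(thread) != 0) return false;        // only a request
    void* result = NULL;
    if (pthread_join(thread, &result) != 0) return false; // waits and reclaims resources
    return result == PTHREAD_CANCELED;
}
```

In the reporter from the question, the equivalent change would be a pthread_join(thread, NULL) call right after pthread_cancel(thread) in end_report.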

And just to add, the important point when using pthread_cancel() (apart from remembering to join the thread) is to make sure that the thread you are canceling has a so-called cancellation point, which your thread does by calling nanosleep() (and possibly also fopen, fprintf and fclose which may be cancellation points). If no cancellation point exists, your thread will just keep running.
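As a sketch of the cancellation-point remark: a thread that only computes has no cancellation point of its own, and pthread_testcancel() can provide one explicitly. The names busy_worker and testcancel_demo are hypothetical, and the counter loop merely stands in for real work.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical worker: pure computation, so it has no implicit cancellation point.
static void* busy_worker(void* arg) {
    volatile unsigned long* counter = static_cast<volatile unsigned long*>(arg);
    for (;;) {
        *counter = *counter + 1; // stand-in for real work
        pthread_testcancel();    // explicit cancellation point
    }
    return NULL; // never reached
}

// Lets the worker run briefly, cancels it, and joins it.
// Returns true if the thread ended by cancellation.
bool testcancel_demo() {
    unsigned long counter = 0;
    pthread_t thread;
    if (pthread_create(&thread, NULL, busy_worker, &counter) != 0) return false;

    struct timespec ts;
    ts.tv_sec = 0;
    ts.tv_nsec = 10 * 1000 * 1000; // give the worker about 10 ms
    nanosleep(&ts, NULL);

    if (pthread_cancel(thread) != 0) return false;
    void* result = NULL;
    if (pthread_join(thread, &result) != 0) return false;
    return result == PTHREAD_CANCELED;
}
```

Without the pthread_testcancel() call, and with the default deferred cancellation type, this worker would keep running and the join would never return.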
