sarath

Reputation: 543

Code execution time in Linux

I was trying to get the execution time of a particular piece of code (a loop, a function, etc.). I heard that the time command or the clock() function does the job, but I need milli/microsecond accuracy. So I wrote something like this.

#include <stdio.h>
#include <sys/time.h>

int main()
{
    struct timeval ts1, ts2;
    long long time1, time2, diff;
    int i, var;

    scanf("%d", &var);
    gettimeofday(&ts1, NULL);
    time1 = ((long long)ts1.tv_sec * 1000000) + ts1.tv_usec;

    for (i=0; i<var; i++); // <-- Trying to measure execution time for the loop

    gettimeofday(&ts2, NULL);
    time2 = ((long long)ts2.tv_sec * 1000000) + ts2.tv_usec;

    printf("-------------------------\n");
    diff = time2 - time1;
    printf("total %lld microseconds\n", diff);
    printf("%lld seconds\n", diff/1000000);
    diff %= 1000000;
    printf("%lld milliseconds\n", diff/1000);
    diff %= 1000;
    printf("%lld microseconds\n", diff);
    printf("-------------------------\n");
    return 0;
}

I have two concerns here:

  1. Is the above code reliable, and does it do what I intend? I'm not quite sure about it ;)
  2. When I compile the code with optimization level -O2, it doesn't work at all. I know -O2 applies some transformations, but how can I see what happened? If the code in 1 is otherwise fine, can anyone suggest how to deal with the -O2 issue?

Appreciate the help! Thanks.

Upvotes: 0

Views: 535

Answers (4)

jfly

Reputation: 8010

The code you show measures real (wall-clock) elapsed time, since gettimeofday() just returns the wall-clock time. As for it not working at optimization level -O2: declare i as volatile int i, which prevents the compiler from optimizing the loop away.

Upvotes: 1

Emmet

Reputation: 6421

Variable-rate CPU clocks and exploitation of thermal headroom (e.g. turbo boost) have left me increasingly suspicious that wall-clock timings of functions that don't run long enough to heat the core up are less useful than cycle counts.

If I'm instrumenting my own code, I tend to prefer something like:

#include <stdint.h>

static __inline__ uint64_t rdtsc(void)
{
    uint32_t hi, lo;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)lo) | (((uint64_t)hi) << 32);
}

Using this, I can record the TSC value before and after a function call, subtract the two, and get the number of cycles spent.

If you want wall-clock time, you can use clock_gettime() from time.h, which will give you nanosecond resolution (if not accuracy), and use the following to subtract two (before and after) struct timespec objects:

#include <time.h>

#define NSEC_PER_SEC 1000000000

static int timespec_subtract(struct timespec *result,
                             struct timespec *x, struct timespec *y)
{
    /* Perform the carry for the later subtraction by updating y. */
    if (x->tv_nsec < y->tv_nsec) {
        int nsec = (y->tv_nsec - x->tv_nsec) / NSEC_PER_SEC + 1;
        y->tv_nsec -= NSEC_PER_SEC * nsec;
        y->tv_sec += nsec;
    }
    if (x->tv_nsec - y->tv_nsec > NSEC_PER_SEC) {
        int nsec = (x->tv_nsec - y->tv_nsec) / NSEC_PER_SEC;
        y->tv_nsec += NSEC_PER_SEC * nsec;
        y->tv_sec -= nsec;
    }

    /* Compute the difference; tv_nsec is certainly positive. */
    result->tv_sec = x->tv_sec - y->tv_sec;
    result->tv_nsec = x->tv_nsec - y->tv_nsec;

    /* Return 1 if the result is negative. */
    return x->tv_sec < y->tv_sec;
}
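The before/after pattern with clock_gettime looks like this. This is a self-contained sketch (function names are mine, and it converts to double seconds rather than calling timespec_subtract, to keep it short); CLOCK_MONOTONIC is used so the value can't jump if the system clock is adjusted mid-measurement:

```cpp
#include <time.h>

// Current monotonic time in seconds, as a double.
static double monotonic_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Wrap the code under test between two readings; the workload
// here is just a placeholder loop.
double measure_demo(void)
{
    double t0 = monotonic_seconds();
    volatile long sum = 0;
    for (long i = 0; i < 1000000; i++)   // placeholder workload
        sum += i;
    return monotonic_seconds() - t0;     // elapsed seconds
}
```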

That said, I tend to use perf and avoid instrumentation altogether.
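For the perf route, no instrumentation is needed at all; a typical invocation (assuming perf is installed and ./a.out is your binary) is:

```shell
# Count cycles and instructions for the whole run of ./a.out
perf stat -e cycles,instructions ./a.out
```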

Upvotes: 0

DOOM

Reputation: 1244

You can use the sample code below; it adds essentially no overhead to the computation being measured.

#include <sys/time.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/resource.h>

void timing(double* wcTime, double* cpuTime)
{
    struct timeval tp;

    gettimeofday(&tp, NULL);
    *wcTime=(double) (tp.tv_sec + tp.tv_usec/1000000.0);

    struct rusage ruse;
    getrusage(RUSAGE_SELF, &ruse);
    *cpuTime=(double)(ruse.ru_utime.tv_sec+ruse.ru_utime.tv_usec / 1000000.0);
}

TO USE:

double  wcs,    //  Wall Clock Start
        wce,    //  Wall Clock End
        ccs,    //  CPU Clock Start
        cce;    //  CPU Clock End
timing(&wcs, &ccs);

//  COMPUTATION CODE

timing(&wce, &cce);

cout << "CPU RUNTIME:       " << cce - ccs << endl
     << "WALL CLOCK TIME:   " << wce - wcs << endl;

Upvotes: 0

andreaplanet

Reputation: 771

This NanoTimer class (header file) should do the job. Use startTimer()/stopTimer(). Please note that calculating the elapsed time at this resolution takes some time itself, so you will never see a value of 0 if you execute just startTimer(); stopTimer(); with no code in between. There are also many other factors that influence the elapsed time, so you should repeat a given measurement several times and take the lowest value.

#include <ctime>
#include <sys/types.h>

class NanoTimer
{
    struct timespec ts_;
    u_int64_t startTimer_;
    u_int64_t totalTimer_;
public:
    NanoTimer()
    {
        totalTimer_ = 0;
        startTimer_ = 0;
    }

    u_int64_t getNanoSecTimer(void)
    {
        clock_gettime(CLOCK_REALTIME, &ts_);
        // Cast before multiplying so the math is done in 64 bits
        return (u_int64_t)ts_.tv_sec * 1000000000 + ts_.tv_nsec;
    }

    void startTimer(void)
    {
        startTimer_ = getNanoSecTimer();
    }
    void stopTimer(void)
    {
        //assert(startTimer_ > 0);
        totalTimer_ += getNanoSecTimer() - startTimer_;
        startTimer_ = 0;
    }
    inline u_int32_t getTotalSeconds()
    {
        return totalTimer_/1000000000;
    }
    inline u_int32_t getTotalMilliseconds()
    {
        return totalTimer_/1000000;
    }
    inline u_int32_t getTotalMicroseconds()
    {
        return totalTimer_/1000;
    }
    inline u_int32_t getTotalNanoseconds()
    {
        return totalTimer_;
    }
    inline u_int32_t getCurrentSeconds()
    {
        return (totalTimer_ + (startTimer_ > 0 ? getNanoSecTimer() - startTimer_ : 0)) / 1000000000;
    }
};
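The repeat-and-take-the-lowest advice can also be sketched without the class. A minimal illustration (names are mine), using clock_gettime with CLOCK_MONOTONIC, which unlike CLOCK_REALTIME cannot jump backwards if the system clock is adjusted:

```cpp
#include <time.h>
#include <stdint.h>
#include <algorithm>

// Current monotonic time in nanoseconds.
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

// Run f() `reps` times and keep the lowest elapsed time, which
// filters out scheduling noise and cache-cold first runs.
template <typename F>
uint64_t min_elapsed_ns(F f, int reps)
{
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < reps; r++) {
        uint64_t t0 = now_ns();
        f();
        best = std::min(best, now_ns() - t0);
    }
    return best;
}
```

Usage is e.g. min_elapsed_ns(myFunction, 10) for a free function, or a lambda for an arbitrary snippet.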

Upvotes: 1
