Louise K

Reputation: 4811

Which timer to use when comparing C code to CUDA code?

I'm currently working on two implementations of an algorithm, one in C and one in CUDA, and I plan to compare the two in terms of runtime. My question is: which C timer would be best for this comparison? For CUDA I'll be using events, and I've read about C timers such as clock() and gettimeofday() as well as the high-resolution clock_gettime(), but I'm unsure which one to use if I'm going to be comparing my C times against my CUDA times.
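For reference, the event-based CUDA timing I have in mind looks roughly like this (the kernel and its launch configuration are just placeholders):

cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

cudaEventRecord(start, 0);
my_kernel<<<grid, block>>>(d_data, n);           // placeholder kernel and arguments
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);                      // wait until the stop event has completed

float elapsed_ms = 0.0f;
cudaEventElapsedTime(&elapsed_ms, start, stop);  // elapsed time in milliseconds

cudaEventDestroy(start);
cudaEventDestroy(stop);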

Thanks :-)

Upvotes: 1

Views: 553

Answers (4)

Prince

Reputation: 33

I've used the following code with good, accurate results:

#include <time.h>

/* Returns the current value of a monotonic clock in milliseconds. */
unsigned long get_tick(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (unsigned long)ts.tv_sec * 1000UL + (unsigned long)ts.tv_nsec / 1000000UL;
}

Call get_tick() before and after the code you want to time and subtract the two values; the result is the elapsed time in milliseconds. Divide by 1000 to get seconds.
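For example, a minimal usage sketch (it assumes the get_tick() above; run_algorithm() is just a placeholder for the code being timed):

#include <stdio.h>

int main(void)
{
    unsigned long t0 = get_tick();
    run_algorithm();                 /* placeholder for the code being timed */
    unsigned long t1 = get_tick();
    printf("Elapsed: %lu ms (%.3f s)\n", t1 - t0, (t1 - t0) / 1000.0);
    return 0;
}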

Upvotes: 0

Bardia

Reputation: 393

#include "time.h"

clock_t init, final;

init=clock();

...
//your sequential algoritm
...

final=clock()-init;
float seq_time ((double)final / ((double)CLOCKS_PER_SEC));
printf("\nThe sequential duration is %f seconds.", seq_time);

//Clock is initialized again
init=clock();

...
//your parallel algoritm
...

final=clock()-init;
float par_time ((double)final / ((double)CLOCKS_PER_SEC));
printf("\nThe parallel duration is %f seconds.", par_time);

printf("\n\nSpped up is %f seconds. (%dX Faster)", (seq_time - par_time), ((int)(seq_time / par_time)));

Upvotes: 0

njuffa

Reputation: 26205

For end-to-end measurements at application level, I would recommend using a high-precision host timer, as in the code below, which I have used for well over a decade. For detailed measurements of potentially extremely short GPU activity, I would suggest using CUDA events.

#if defined(_WIN32)
#if !defined(WIN32_LEAN_AND_MEAN)
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
double second (void)
{
    LARGE_INTEGER t;
    static double oofreq;
    static int checkedForHighResTimer;
    static BOOL hasHighResTimer;

    if (!checkedForHighResTimer) {
        hasHighResTimer = QueryPerformanceFrequency (&t);
        oofreq = 1.0 / (double)t.QuadPart;
        checkedForHighResTimer = 1;
    }
    if (hasHighResTimer) {
        QueryPerformanceCounter (&t);
        return (double)t.QuadPart * oofreq;
    } else {
        return (double)GetTickCount() * 1.0e-3;
    }
}
#elif defined(__linux__) || defined(__APPLE__)
#include <stddef.h>
#include <sys/time.h>
double second (void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}
#else
#error unsupported platform
#endif
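Usage is a simple bracketing of the region to be timed. A minimal sketch (work() is just a placeholder, <stdio.h> is needed for printf, and the cudaDeviceSynchronize() call applies only when asynchronous GPU work is being timed from the host):

double start = second();
work();                      /* placeholder for the code being timed             */
cudaDeviceSynchronize();     /* only needed when timing asynchronous GPU work    */
double stop = second();
printf("Elapsed: %.6f seconds\n", stop - start);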

Upvotes: 3

jleahy

Reputation: 16895

It's probably best to stick to something relatively simple. I'd recommend gettimeofday, which provides a timestamp with microsecond resolution. Just record the time before and after your computation, then subtract the two; you can use the timersub macro to do this.

http://linux.die.net/man/2/gettimeofday

http://linux.die.net/man/3/timercmp
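Putting that together, a minimal sketch (compute() is just a placeholder; timersub is a BSD/glibc extension declared in <sys/time.h>):

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, stop, elapsed;

    gettimeofday(&start, NULL);
    compute();                          /* placeholder for the code being timed */
    gettimeofday(&stop, NULL);

    timersub(&stop, &start, &elapsed);  /* elapsed = stop - start */
    printf("Elapsed: %ld.%06ld seconds\n",
           (long)elapsed.tv_sec, (long)elapsed.tv_usec);
    return 0;
}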

Upvotes: 1
