mfaieghi

Reputation: 610

Timer resolution in OpenCL profiling

I need some clarification on timer resolution. I'm learning profiling in OpenCL. I have a reduction algorithm implemented in OpenCL and want to measure the kernel execution time by computing the total elapsed time, as in the code given below. I ran this code on different devices and here are the results:

On CPU -- AMD FX 770K: Total time = 352,855,601; CL_DEVICE_PROFILING_TIMER_RESOLUTION = 69 ns

On GPU -- AMD Radeon R7 240: Total time = 172,297; CL_DEVICE_PROFILING_TIMER_RESOLUTION = 1 ns

On another GPU -- GeForce GT 610: Total time = 1,725,504; CL_DEVICE_PROFILING_TIMER_RESOLUTION = 1000 ns

Is the "Total time" given above already in nanoseconds, or do I need to divide it by the timer resolution to get the actual execution time? What is the timer resolution useful for?

Here is a part of the code:

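/* Enqueue kernel (the queue must have been created with CL_QUEUE_PROFILING_ENABLE) */
        err = clEnqueueNDRangeKernel(queue, kernel[i], 1, NULL, &global_size,
            &local_size, 0, NULL, &prof_event);
        if (err != CL_SUCCESS) {
            /* OpenCL reports failures through its return code, not errno */
            fprintf(stderr, "Couldn't enqueue the kernel (error %d)\n", err);
            exit(1);
        }

/* Finish processing the queue and get profiling information */
        clFinish(queue);
        clGetEventProfilingInfo(prof_event, CL_PROFILING_COMMAND_START,
            sizeof(time_start), &time_start, NULL);
        clGetEventProfilingInfo(prof_event, CL_PROFILING_COMMAND_END,
            sizeof(time_end), &time_end, NULL);
        total_time = time_end - time_start;

/* Profiling counters are cl_ulong (64-bit); cast so the format specifier is portable */
printf("Total time = %llu\n\n", (unsigned long long)total_time);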

Upvotes: 1

Views: 762

Answers (1)

Dithermaster

Reputation: 6343

The specification is pretty clear on this: "current device time counter in nanoseconds"

The times are always in nanoseconds. The resolution query tells you how accurate those values are. For example, given the measurements and resolutions you posted, you can deduce the error margin of each measurement:

AMD FX 770K:

  • Measured: 352,855,601 ± 69 ns
  • Actual: 352,855,532 - 352,855,670

AMD Radeon R7 240:

  • Measured: 172,297 ± 1 ns
  • Actual: 172,296 - 172,298

GeForce GT 610:

  • Measured: 1,725,504 ± 1000 ns
  • Actual: 1,724,504 - 1,726,504
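
The arithmetic in the lists above can be sketched in plain C. This is only an illustration, assuming the resolution is treated as a symmetric ± margin as done above. The names `ns_range`, `profiling_range`, `ns_to_ms` and `print_measurement` are hypothetical helpers, not part of OpenCL. In a real program, `elapsed_ns` would be the difference of the two `clGetEventProfilingInfo` values and `resolution_ns` would come from `clGetDeviceInfo` with `CL_DEVICE_PROFILING_TIMER_RESOLUTION`.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type (not an OpenCL type): bounds of a measurement in ns. */
typedef struct {
    uint64_t low_ns;
    uint64_t high_ns;
} ns_range;

/* elapsed_ns:    COMMAND_END minus COMMAND_START, already in nanoseconds.
 * resolution_ns: the device's profiling timer resolution in nanoseconds.
 * Assumption: the resolution is applied as a symmetric +/- margin. */
ns_range profiling_range(uint64_t elapsed_ns, uint64_t resolution_ns)
{
    ns_range r;
    /* Clamp at zero so a very short kernel cannot underflow. */
    r.low_ns = (elapsed_ns > resolution_ns) ? elapsed_ns - resolution_ns : 0;
    r.high_ns = elapsed_ns + resolution_ns;
    return r;
}

/* No division by the resolution is needed; only a unit conversion. */
double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1e6;
}

/* Print one measurement with its bounds. */
void print_measurement(const char *device, uint64_t elapsed_ns,
                       uint64_t resolution_ns)
{
    ns_range r = profiling_range(elapsed_ns, resolution_ns);
    printf("%s: %.6f ms (between %llu and %llu ns)\n", device,
           ns_to_ms(elapsed_ns),
           (unsigned long long)r.low_ns, (unsigned long long)r.high_ns);
}
```

For instance, `print_measurement("GeForce GT 610", 1725504, 1000)` would report the third measurement in milliseconds together with the same bounds listed above.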

Upvotes: 4
