How to observe CUDA events and metrics for a subsection of an executable (e.g. only during a kernel execution time)?

Question

I'm familiar with using nvprof to access the events and metrics of a benchmark, e.g.,

nvprof --system-profiling on --print-gpu-trace -o (file name) --events inst_issued1 ./benchmarkname

The

system-profiling on --print-gpu-trace -o (filename)

command gives timestamps for start time, kernel end times, power, temp and saves the info an nvvp files so we can view it in the visual profiler. This allows us to see what's happening in any section of a code, in particular when a specific kernel is running. My question is this--

Is there a way to isolate the events counted for only a section of the benchmark run, for example during a kernel execution? In the command above,

--events inst_issued1

just gives the instructions tallied for the whole executable. Thanks!

travelingbones · Accepted Answer

After looking into this a bit more, it turns out that kernel level information is also given for all kernels (w/o using --kernels and specifying them specifically) by using

nvprof --events  --metrics  ./

In fact, it gives output of the form

"Device","Kernel","Invocations","Event Name","Min","Max","Avg"

If a kernel is called multiple times in the benchmark, this allows you to see the Min, Max, Avg of the desired events for those kerne runs. Evidently the --kernels option on Cuda 7.5 Profiler allows each run of each kernel to be specified.

How to observe CUDA events and metrics for a subsection of an executable (e.g. only during a kernel execution time)?

Answers (2)

Related Questions