Reputation: 31
Profiling CUDA programs with nvprof.
I have described this problem before in "How to collect the event value every time the kernel function been invocated?", but I will restate it here.
Running either of the following commands:

nvprof --events tex0_cache_sector_queries --replay-mode kernel ./matrixMul
nvprof --events tex0_cache_sector_queries --replay-mode application ./matrixMul

collects the event values and produces this result:
==40013== Profiling application: ./matrixMul
==40013== Profiling result:
==40013== Event result:
"Device","Kernel","Invocations","Event Name","Min","Max","Avg","Total"
"Tesla K80 (0)","void matrixMulCUDA<int=32>(float*, float*, float*, int, int)",301,"tex0_cache_sector_queries",0,30,24,7224
The result above is only a summary: the kernel matrixMulCUDA was invoked 301 times, but nvprof reports only the min, max, average, and total of the tex0_cache_sector_queries values across those 301 invocations. I want the complete set of 301 values, i.e. the tex0_cache_sector_queries event value for each individual invocation of matrixMulCUDA. How can I collect that?
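For context, the 301 invocations come from the benchmark launching the kernel repeatedly (apparently one warmup launch plus a 300-iteration timing loop in the matrixMul sample). The minimal, self-contained sketch below uses a hypothetical dummyKernel to illustrate that launch pattern; it is not the actual sample code, only a stand-in showing why nvprof sees 301 separate invocations of a single kernel:

#include <cuda_runtime.h>

// Hypothetical stand-in kernel; in the real program this is matrixMulCUDA<32>.
__global__ void dummyKernel(float *out, const float *in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

int main() {
    const int n = 1 << 20;
    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));

    // One warmup launch plus 300 measured launches = 301 invocations,
    // matching the invocation count nvprof reports above.
    dummyKernel<<<(n + 255) / 256, 256>>>(d_out, d_in, n);
    for (int j = 0; j < 300; j++) {
        dummyKernel<<<(n + 255) / 256, 256>>>(d_out, d_in, n);
    }
    cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}

In the default summary mode, nvprof folds all 301 launches of the same kernel into a single row, which is why only min/max/avg/total are shown.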
Upvotes: 0
Views: 335
Reputation: 31
1 Run:
nvprof --pc-sampling-period 31 --print-gpu-trace --replay-mode application \
--export-profile application.prof --events tex0_cache_sector_queries ./matrixMul
2 Import application.prof into the Visual Profiler.
3 Follow the numbered steps in the picture to see the event value for every invocation of each kernel function.
4 The --print-gpu-trace option is what makes this work. Per the nvprof documentation (see the print-gpu-trace entry), it will "print individual kernel invocations (including CUDA memcpys/memsets) and sort them in chronological order. In event/metric profiling mode, show events/metrics for each kernel invocation", which is exactly what solves this problem (a stripped-down example is shown below).
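If you only need the per-invocation event values printed to the console, and not the PC-sampling data or an exported profile, a shorter command along the same lines should also work (this is a sketch based on the documented --print-gpu-trace behavior above, not part of the original answer):

nvprof --events tex0_cache_sector_queries --print-gpu-trace ./matrixMul

Each row of the GPU trace should then carry the tex0_cache_sector_queries value for that particular launch of matrixMulCUDA.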
Upvotes: 1