How to setup a dedicated GPU in order to benchmark a CUDA kernel?

Question

I want to use second GPU device as a dedicate device under linux, in order to benchmark a kernel.

The kernel that I am testing is a SIMD computing kernel without reductions and not X-Server is attached to the GPU, the device is a GeForge GTX-480 so I suppose that the compute capability is 2. Therefore, advanced features as dynamic parallelism and others, are disabled.

using the nvidia-smi utility there are various modes to setup the GPU

"Default" means multiple contexts are allowed per device.
"Exclusive Process" means only one context is allowed per device, usable from multiple threads at a time.
"Prohibited" means no contexts are allowed per device (no compute apps).

Which is the best mode to setup the GPU in order to obtain a benchmark as faithful as possible?

What is the command that I should use in order to make permanent such setup?

I am compiling the kernel using the following flags:

nvcc --ptxas-options=-v -O3   -w   -arch=sm_20 -use_fast_math -c -o

Exist a better combination of flags in order to obtain more help from the compiler to get faster execution times?

Any suggestion will be very appreciated.

Robert Crovella · Accepted Answer

my question is related to what is more appropriated? setup the GPU to a compute-exclusive mode or not.

It should not matter whether you set the GPU to exclusive-process or Default, as long as there is only one process attempting to use that GPU.

You generally would not want to use exclusive-thread except in specific situations, because exclusive-thread could prevent multi-threaded GPU apps from running correctly, and may also interfere with other functions such as profiler functions.

What is the command that I should use in order to make permanent such setup?

If you refer to the nvidia-smi command line help (nvidia-smi --help) or the nvidia-smi man page (man nvidia-smi), you can determine the command to make the change. Any changes you make will be permanent until they are explicitly changed again.

How to setup a dedicated GPU in order to benchmark a CUDA kernel?

Answers (1)

Related Questions