depa981

Reputation: 11

GPU usage in OpenCL program

I have a question about GPU usage when running an OpenCL kernel. I wrote a program that takes an image and applies a filter to it. The program works as intended and the filter is applied correctly, but I don't know whether the work is done by the CPU or by the GPU, because Task Manager reports 0% GPU usage. Is it possible that the program finishes so quickly that Task Manager does not register the GPU usage (the image is not very big)? Is there a way to check that the GPU is actually doing the work? Thanks

Upvotes: 1

Views: 1446

Answers (2)

ProjectPhysX

Reputation: 5754

Windows 10 Task Manager does not display GPU usage correctly for some OpenCL programs. In some cases the correct usage appears in the "3D" tab, sometimes in the "Compute_0" tab, sometimes in the "Cuda" tab, and sometimes not at all. When a tab other than "3D" displays the correct percentage, the overview on the left still shows 0%. Whether the usage is displayed correctly depends on the driver version and even on the instructions you use in the OpenCL kernel. Task Manager estimates its GPU numbers from WDDM.

For more reliable readings (as well as memory bandwidth usage, GPU temperature, etc.), use nvidia-smi (Nvidia GPUs) or rocm-smi (AMD GPUs); these tools are much more accurate.

Upvotes: 1

pmdj

Reputation: 23438

You can control very precisely which device runs your kernels. When creating the OpenCL context, pass a specific device ID that you obtained by enumerating the devices with an appropriate filter.

For example:

cl_device_id device_ids[5] = {};
cl_uint num = 0;
cl_int err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 5, device_ids, &num);
// This will limit to GPU devices only ----------^^^
// Don't forget to actually check & handle any error here.
//
// ... make sure at least 1 device was returned, get device information and
// use it to choose a device ...
cl_device_id device_id = device_ids[0]; // most trivial selection: just pick the first device

// report_cl_error is your own error-notification callback (pass NULL if you have none).
cl_context context = clCreateContext(NULL, 1, &device_id, report_cl_error, NULL, &err);
// any kernels run on this context will run on ^^this^^^ selected GPU

I have never encountered or heard of an OpenCL implementation which did not run the kernels on the selected device.

If whichever GPU usage indicator you are looking at shows 0%, the reason is either what you suspected (the work finishes too quickly to register) or that the indicator itself uses an imperfect measure. For example, OpenCL usage might not count towards it.
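To put a rough number on the "too fast to notice" possibility, here is a back-of-the-envelope sketch in C. It assumes the indicator averages GPU busy time over a fixed sampling window; the 1000 ms window and the 2 ms kernel duration used in the note below are assumptions for illustration, not documented or measured values. The helper names busy_percent and print_estimate are hypothetical. The real kernel duration can be measured with OpenCL event profiling (clGetEventProfilingInfo on a queue created with CL_QUEUE_PROFILING_ENABLE).

```c
#include <stdio.h>

/* Share of a sampling window during which the GPU was busy, in percent.
 * Both arguments are in milliseconds. Hypothetical helper for illustration. */
double busy_percent(double busy_ms, double window_ms)
{
    if (window_ms <= 0.0) return 0.0;          /* no meaningful window */
    double pct = 100.0 * busy_ms / window_ms;
    return pct > 100.0 ? 100.0 : pct;          /* cap at fully busy */
}

/* Print the estimate for one kernel run inside one sampling window. */
void print_estimate(double kernel_ms, double window_ms)
{
    printf("%.1f ms of GPU work in a %.0f ms window = %.2f%% usage\n",
           kernel_ms, window_ms, busy_percent(kernel_ms, window_ms));
}
```

With the assumed figures, print_estimate(2.0, 1000.0) reports 0.20% usage, which an indicator that shows whole percentages would display as 0%. A single small image filtered once can therefore easily go unnoticed even when it runs entirely on the GPU.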

Upvotes: 0
