Reputation: 291
I'm implementing a complex algorithm in OpenCL, but I'm facing an issue on NVIDIA hardware. When the algorithm is called multiple times in parallel, the GPU runs out of memory and random threads may stop with memory allocation errors. (I tried to explain this before at https://devtalk.nvidia.com/default/topic/1019997/cuda-programming-and-performance/how-to-handle-cl_mem_object_allocation_failure-errors-if-amount-of-useable-memory-is-not-known-/.)
My current solution is to request the available memory on the GPU and only allow execution if there is enough. The problem is reading out the available memory.
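The "only allow execution if there is enough" idea can be sketched roughly as follows. This is a minimal illustration, not the code from this post: `MemoryGate`, `tryReserve`, `release`, and the injected `queryAvailableKb` callable are all hypothetical names, and the query is assumed to return the free device memory in KB by whatever means is available.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper: admits a job only if the reported free memory, minus
// what already-admitted jobs have reserved, covers the request.
class MemoryGate {
public:
    // queryAvailableKb is an assumption: any callable that returns the
    // currently free device memory in KB.
    explicit MemoryGate(std::function<std::uint64_t()> queryAvailableKb)
        : query_(std::move(queryAvailableKb)), reservedKb_(0) {}

    // Returns true and records the reservation if requiredKb fits.
    // The lock keeps two parallel callers from both passing the check
    // against the same reading.
    bool tryReserve(std::uint64_t requiredKb) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::uint64_t availableKb = query_();
        // Conservative: memory that admitted jobs have already allocated is
        // counted twice (by the driver and by reservedKb_), so the gate may
        // reject a job that would in fact fit.
        if (reservedKb_ > availableKb) return false;
        if (requiredKb > availableKb - reservedKb_) return false;
        reservedKb_ += requiredKb;
        return true;
    }

    // Call when the job has finished and freed its buffers.
    void release(std::uint64_t requiredKb) {
        std::lock_guard<std::mutex> lock(mutex_);
        reservedKb_ -= std::min(requiredKb, reservedKb_);
    }

private:
    std::function<std::uint64_t()> query_;
    std::mutex mutex_;
    std::uint64_t reservedKb_;
};
```

Other processes can still allocate GPU memory between the check and the actual allocation, so a gate like this reduces allocation failures but cannot rule them out.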
I use
#define GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX 0x9049
glGetIntegerv(GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX, &currentlyAvailableMemoryInKb);
to read out the memory, for which I create a hidden window with an OpenGL context. The problem now is that when the NVIDIA GPU is not the main GPU, I have to select the card using the GPU affinity extension (https://www.khronos.org/registry/OpenGL/extensions/NV/WGL_NV_gpu_affinity.txt), but wglGetProcAddress returns null. I think this is because I'm booting with the Intel GPU as the main device. (Loading other extensions such as wglCreatePbufferARB is not a problem.)
Is there a way to handle this and forward the wglGetProcAddress call to another GPU/driver?
Thanks in advance! Best Regards Michael
PS: I also tried using the CUDA runtime to get the available memory. This did not work out: the OpenCL driver became unreliable (it deadlocked after some cudart features had been used) and the returned value was incorrect.
Upvotes: 0
Views: 352
Reputation: 291
I found another solution. I did not know about NVAPI before, but this library solved the problem.
I use OpenCL to get the PCI bus ID of the selected NVIDIA card:
#define CL_DEVICE_PCI_BUS_ID_NV 0x4008
cl_int busId = 0;
device.getInfo(CL_DEVICE_PCI_BUS_ID_NV, &busId);
Then I use NvAPI_EnumPhysicalGPUs to enumerate the NVIDIA GPUs. With NvAPI_GPU_GetBusId I get the PCI bus ID of each device returned by the previous function. If the bus ID matches the one reported by OpenCL, I call NvAPI_GPU_GetMemoryInfo to get the currently available amount of memory.
So far this solves all the issues I had. This means I can drop OpenGL and the ugly hack of opening an invisible window.
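The matching step could look roughly like the sketch below. To keep it compilable without the NVAPI SDK, it works on a plain list of records rather than calling NVAPI directly: `GpuMemoryInfo` and `findAvailableKb` are hypothetical names, and the assumption is that the list has been filled beforehand by looping over the handles from NvAPI_EnumPhysicalGPUs and querying NvAPI_GPU_GetBusId and NvAPI_GPU_GetMemoryInfo for each.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record for one physical GPU. In a real build the fields would
// be filled from NVAPI (bus ID via NvAPI_GPU_GetBusId, free memory via
// NvAPI_GPU_GetMemoryInfo); that glue code is not shown here.
struct GpuMemoryInfo {
    std::uint32_t busId;        // PCI bus ID of the GPU
    std::uint32_t availableKb;  // currently available memory, assumed in KB
};

// Looks up the GPU whose bus ID equals the one OpenCL reported through
// CL_DEVICE_PCI_BUS_ID_NV. On a match, writes the free memory to outKb and
// returns true; otherwise leaves outKb untouched and returns false.
bool findAvailableKb(const std::vector<GpuMemoryInfo>& gpus,
                     std::uint32_t openclBusId,
                     std::uint32_t& outKb) {
    for (const GpuMemoryInfo& gpu : gpus) {
        if (gpu.busId == openclBusId) {
            outKb = gpu.availableKb;
            return true;
        }
    }
    return false;
}
```

Returning false for "no match" keeps the case where the OpenCL device is not an NVIDIA GPU distinct from a GPU that simply has no free memory.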
Upvotes: 3