Reputation: 1
I am running the following MATLAB code on a system with one GTX 1080 and a K80 (with 2 GPUs)
delete(gcp('nocreate'));
parpool('local',2);
spmd
gpuDevice(labindex+1)
end
reset(gpuDevice(2))
reset(gpuDevice(3))
parfor i=1:100
SingleGPUMatlabCode(i);
end
The code runs for around a second. When I rerun the code after few seconds. I get the message:
Error using parallel.gpu.CUDADevice/reset
An unexpected error occurred during CUDA execution. The
CUDA error was:
unknown error
Error in CreateDictionary
reset(gpuDevice(2))
I tried increasing TdrDelay
, but it did not help.
Upvotes: 0
Views: 170
Reputation: 322
Something in your GPU code is causing an error on the device. Because the code is running asynchronously, this error is not picked up until the next synchronisation point, which is when you run the code again. I would need to see the contents of SingleGPUMatlabCode
to know what that error might be. Perhaps there's an allocation failure or an out of bounds access. Errors that aren't correctly handled will get converted to 'unknown error' at the next CUDA operation.
Try adding wait(gpuDevice)
inside the loop to identify when the error is occurring.
If either device 2 or 3 are the GTX1080, you may have discovered an issue with MATLAB's restricted support for the Pascal architecture. See https://www.mathworks.com/matlabcentral/answers/309235-can-i-use-my-nvidia-pascal-architecture-gpu-with-matlab-for-gpu-computing
If this is caused by the Windows timeout, you would see a several second screen blackout.
Upvotes: 1