Reputation: 31
Like many others, I'm getting a runtime error of CUDA out of memory, but for some reason PyTorch has reserved a large amount of the memory.
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 6.00 GiB total capacity; 4.31 GiB already allocated; 844.80 KiB free; 4.71 GiB reserved in total by PyTorch)
I've tried torch.cuda.empty_cache(), but that isn't working, and none of the other CUDA out-of-memory posts have helped me either.
When I check my GPU usage (nvidia-smi) before running my Python program, there is plenty of memory free.
Upvotes: 3
Views: 16918
Reputation: 1187
You should find the processes holding the GPU by typing
!ps aux | grep python
and then, once you have found the process, kill it by typing
!kill -9 652
This will kill the process with PID 652. In your case the number will be something else: use the PID of the process you want to get rid of.
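If you would rather do this from Python, here is a minimal sketch (assuming Linux and that nvidia-smi is on your PATH) that asks nvidia-smi which PIDs are actually holding GPU memory, so you don't have to guess among all the python processes grep returns. The kill line is commented out so you can inspect the PIDs first:

import os
import signal
import subprocess

# Ask nvidia-smi for the processes currently holding GPU memory.
out = subprocess.check_output(
    ["nvidia-smi", "--query-compute-apps=pid,used_memory",
     "--format=csv,noheader"],
    text=True,
)

for line in out.strip().splitlines():
    pid, used = (field.strip() for field in line.split(","))
    print(f"PID {pid} is using {used} of GPU memory")
    # Uncomment to terminate it (the equivalent of kill -9):
    # os.kill(int(pid), signal.SIGKILL)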
NOTE: Remember that you will have to restart your code if you kill a process you should not have ended. But this is the easiest manual way to do it.
Another note: you can also decrease the batch size if the problem happens again after successfully emptying the GPU cache.
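For the batch-size route, the only change is the batch_size argument of your DataLoader; the dataset below is just a placeholder to make the sketch self-contained:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; substitute your own.
dataset = TensorDataset(torch.randn(1000, 3, 32, 32),
                        torch.randint(0, 10, (1000,)))

# A smaller batch_size shrinks the activations and gradients kept on
# the GPU per step, which is usually the bulk of training memory use.
loader = DataLoader(dataset, batch_size=16, shuffle=True)  # was e.g. 64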
Upvotes: 0
Reputation: 96
From the given description, it seems the problem is not memory PyTorch allocated before execution; rather, CUDA ran out of memory while allocating the data. That means the 4.31 GiB was already allocated (not cached), and the final 2 MiB block failed to allocate. A possible solution that already worked for me is to decrease the batch size. Hope that helps!
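To see the allocated-versus-reserved distinction for yourself, you can print PyTorch's own counters. A minimal sketch, assuming PyTorch 1.4+ (where torch.cuda.memory_reserved() replaced memory_cached()) and the cuda:0 device from the error message:

import torch

device = torch.device("cuda:0")

# Memory currently held by live tensors.
allocated = torch.cuda.memory_allocated(device)
# Memory held by PyTorch's caching allocator, including cached blocks
# that empty_cache() can hand back to the driver.
reserved = torch.cuda.memory_reserved(device)

print(f"allocated: {allocated / 1024**3:.2f} GiB")
print(f"reserved:  {reserved / 1024**3:.2f} GiB")

# empty_cache() only releases the reserved-but-unallocated portion; it
# cannot free memory that live tensors still occupy, which is why it
# does not help when 'already allocated' is close to total capacity.
torch.cuda.empty_cache()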
Upvotes: 2