Reputation: 7583
While training a YOLOv8 model, I get an error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 24.00 MiB. GPU 0 has a total capacty of 10.91 GiB of which 6.94 MiB is free. Including non-PyTorch memory, this process has 10.83 GiB memory in use. Of the allocated memory 10.51 GiB is allocated by PyTorch, and 39.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
This happens under Ubuntu 22.04 LTS, Python 3.11.5, PyTorch 2.1.0, Ultralytics 8.0.203, batch size 1, on an NVIDIA GTX 1080 Ti GPU (11 GB).
If I run the model on the CPU of the same machine, it works (slowly, of course). Under Windows 10 with a GTX 1650 GPU (4 GB) it also works fine. Have you encountered such an issue? Any clue?
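For reference, the allocator option the error message points to is controlled by an environment variable that has to be set before PyTorch initializes CUDA; something like the sketch below (the 128 MiB value is just an illustration, and I have not confirmed whether it helps in this case):
import os

# Apply the allocator hint from the error message before torch touches CUDA.
# max_split_size_mb:128 is only an example value, not a recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable so the allocator picks it up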
Upvotes: 1
Views: 2862
Reputation: 446
I had the same problem initially, but I was eventually able to fix it by setting cache=False in the train() method call.
Try this:
from ultralytics import YOLO

model = YOLO('yolov8x.pt')
results = model.train(
    data='dataset.yaml',
    epochs=250,
    imgsz=640,
    batch=-1,        # auto-batch: let Ultralytics pick the batch size
    device=0,
    cache=False      # cache=False is important!
)
As to why this helps, I cannot give you a definitive answer, but I suspect that the caching logic keeps all images in GPU memory (which is obviously bad on a card with limited VRAM), and that this in turn breaks the auto-batching mechanism.
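If you want to check where the memory actually goes on your own machine, a rough way (my own sketch, not part of the Ultralytics API; 'yolov8n.pt' and 'dataset.yaml' are placeholders) is to compare PyTorch's GPU memory counters after a short run with and without caching:
import torch
from ultralytics import YOLO

def report(tag):
    # Current and peak GPU memory held by PyTorch tensors, in MiB.
    alloc = torch.cuda.memory_allocated(0) / 2**20
    peak = torch.cuda.max_memory_allocated(0) / 2**20
    print(f"{tag}: allocated={alloc:.1f} MiB, peak={peak:.1f} MiB")

model = YOLO('yolov8n.pt')  # small model, just for the comparison
report("after model load")

# Run one short epoch; repeat once with cache=True and compare the numbers.
model.train(data='dataset.yaml', epochs=1, imgsz=640, batch=2, device=0, cache=False)
report("after one epoch, cache=False")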
Upvotes: 3