Reputation: 1
I’m training a PatchCore model with an image size of 128x512 on a GPU with 23.67 GiB of memory (CUDA 12.4, PyTorch 2.5.1). However, I’m encountering the following error:

```
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.17 GiB. GPU 0 has a total capacity of 23.67 GiB of which 47.88 MiB is free. Including non-PyTorch memory, this process has 23.62 GiB memory in use. Of the allocated memory 23.29 GiB is allocated by PyTorch, and 15.45 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management.
```
```yaml
data:
  class_path: anomalib.data.Folder
  init_args:
    name: train_data
    root: ""
    image_size:
      - 128
      - 512
    normal_dir: ""
    abnormal_dir: ""
    normal_test_dir: ""
    mask_dir: ""
    normal_split_ratio: 0
    extensions: [".png"]
    train_batch_size: 4
    eval_batch_size: 4
    num_workers: 8
    train_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
          - class_path: torchvision.transforms.v2.RandomAdjustSharpness
            init_args:
              sharpness_factor: 0.7
              p: 0.5
          - class_path: torchvision.transforms.v2.RandomHorizontalFlip
            init_args:
              p: 0.5
          - class_path: torchvision.transforms.v2.Resize
            init_args:
              size: [128, 512]
          - class_path: torchvision.transforms.v2.Normalize
            init_args:
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]
    eval_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
          - class_path: torchvision.transforms.v2.Resize
            init_args:
              size: [128, 512]
          - class_path: torchvision.transforms.v2.Normalize
            init_args:
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]

model:
  class_path: anomalib.models.Patchcore
  init_args:
    backbone: wide_resnet50_2
    layers:
      - layer2
      - layer3
    pre_trained: true
    coreset_sampling_ratio: 0.1
    num_neighbors: 9
```
Steps I’ve Tried:

- Lowering the batch size: I reduced the batch size to as low as 1, but the issue persists.
- Checking for memory fragmentation: I followed the suggestion in the error and set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True (see the sketch after this list), but it did not solve the problem.
- Ruling out other processes: I verified with nvidia-smi that no other process is consuming GPU memory, yet the allocated memory remains maxed out during training.
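For reference, the allocator setting can be applied like this (a minimal sketch; it assumes the variable must be in the environment before the first CUDA allocation, otherwise it is ignored):

```python
# Minimal sketch: configure the CUDA caching allocator before CUDA is initialized.
# Assumption: PYTORCH_CUDA_ALLOC_CONF must be set before the first CUDA
# allocation; setting it afterwards has no effect.
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # import torch only after the environment variable is in place

assert torch.cuda.is_available()  # the allocator picks up the config on first use
```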
Question:

Are there specific optimizations for PatchCore or PyTorch that can help reduce memory usage?
Upvotes: 0
Views: 25
Reputation: 189
Have you tried using mixed precision? You can usually enable it by passing precision="16-mixed" to a Lightning trainer. anomalib seems to have implemented a way to use it during deployment as well.
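Something along these lines, as a minimal sketch with a plain Lightning Trainer (whether anomalib's Engine forwards the same precision argument to its underlying Trainer is an assumption on my part, so check its documentation):

```python
# Minimal sketch: enable automatic mixed precision in PyTorch Lightning 2.x.
# "16-mixed" runs autocast in float16 on CUDA while keeping fp32 where needed.
import lightning.pytorch as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    precision="16-mixed",  # the mixed-precision setting mentioned above
)

# trainer.fit(model, datamodule=datamodule)  # your Patchcore model and Folder datamodule
```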
Upvotes: 0