arturojain

Reputation: 167

Getting Out of Memory error when using Image Classification in SageMaker

When using a p2.xlarge or p3.2xlarge with up to 1TB of memory to run the predefined SageMaker Image Classification algorithm in a training job, I’m getting the following error:

ClientError: Out of Memory. Please use a larger instance and/or reduce the values of other parameters (e.g. batch size, number of layers etc.) if applicable

I’m using 450+ images. I’ve tried resizing them from their original 2000x3000px size down to 244x244px, and even to 24x24px, but I keep getting the same error.

I’ve tried adjusting my hyperparameters: num_classes, num_layers, num_training_samples, optimizer, image_shape, checkpoint_frequency, batch_size and epochs. I’ve also tried using a pretrained model, but the same error keeps occurring.
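For reference, here is a hedged sketch of what that hyperparameter set might look like when passed to the training job. All values are illustrative, not taken from the question, and the built-in Image Classification algorithm calls its batch size `mini_batch_size` — check the current SageMaker docs for the exact names:

```python
# Illustrative values only -- assumed, not from the question above.
hyperparameters = {
    "num_classes": 2,             # assumed number of labels
    "num_layers": 18,             # a small ResNet depth to reduce memory use
    "num_training_samples": 450,  # from the question
    "image_shape": "3,224,224",   # channels,height,width
    "mini_batch_size": 8,         # small batch to reduce GPU memory pressure
    "epochs": 10,
    "use_pretrained_model": 1,    # transfer learning from a pretrained network
}

# These would typically be handed to the estimator, e.g.:
# estimator.set_hyperparameters(**hyperparameters)
```

Dropping `num_layers` and `mini_batch_size` first is usually the cheapest way to test whether the OOM is really coming from the model rather than the input pipeline.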

Upvotes: 1

Views: 4365

Answers (1)

Nick Walsh

Reputation: 1875

Would've added this as a comment but I don't have enough rep yet.

A few clarifying questions so that I can have some more context:

How exactly are you achieving 1TB of RAM?

  1. p2.xlarge instances have 61GB of RAM, and p3.2xlarge instances have 61GB of RAM plus 16GB onboard the Tesla V100 GPU.

How are you storing, resizing, and ingesting the images into the SageMaker algorithm?

  1. The memory error seems suspect considering it still occurs when downsizing images to 24x24. If you are resizing your original images (450 images at 2000x3000 resolution) as in-memory objects and aren't performing the transformations in place (i.e., you're creating new images), you may have a substantial amount of memory pre-allocated, causing the SageMaker training algorithm to throw an OOM error.
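To put rough numbers on that second point, here's a back-of-the-envelope sketch using the figures from the question (pure arithmetic, no SageMaker specifics assumed):

```python
# Rough memory footprint of holding all the originals in memory at once.
images = 450
width, height, channels = 2000, 3000, 3   # original resolution, RGB

bytes_uint8 = images * width * height * channels   # 1 byte per channel
bytes_float32 = bytes_uint8 * 4                    # if decoded/converted to float32

print(f"uint8:   {bytes_uint8 / 2**30:.1f} GiB")   # ~7.5 GiB
print(f"float32: {bytes_float32 / 2**30:.1f} GiB") # ~30.2 GiB
```

A single float copy of the uncompressed set is already ~30 GiB, so keeping the originals plus one or two intermediate copies (resized versions, augmentation buffers) quickly approaches the 61GB on these instances. That's why resizing in place, or streaming images from disk one at a time, matters far more than the final 24x24 target size.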

Upvotes: 2
