Akshay Gupta

Reputation: 31

Training model on AWS Deep Learning AMI instance - gets 'killed' with warnings

I am trying to train an Inception-ResNet-v2 model on my own dataset on Amazon's Deep Learning AMI.

When I train on my local machine, training starts as usual, but when I train on the AWS instance the process gets killed.

First I tried to train with the MXNet backend. It gave the following error:

[screenshot of the error output]

Notice that it gets killed.

So in

nano ~/.keras/keras.json

I tried to set the image data format to channels_first:

{
    "image_data_format": "channels_first", 
    "backend": "mxnet"
}

Then I got the error:

Traceback (most recent call last):
  File "train.py", line 17, in <module>
    model = applications.inception_resnet_v2.InceptionResNetV2(include_top=False, weights='imagenet', input_shape = (img_width, img_height, 3))
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/keras_applications/inception_resnet_v2.py", line 243, in InceptionResNetV2
    weights=weights)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/keras_applications/imagenet_utils.py", line 296, in _obtain_input_shape
    '`input_shape=' + str(input_shape) + '`')
ValueError: The input must have 3 channels; got `input_shape=(182, 182, 3)`
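The ValueError above is consistent with the shape tuple still being in channels_last order while keras.json asks for channels_first: with channels_first, Keras expects the channel count in the first position. A minimal sketch of building the tuple from the configured format follows. The helper name `input_shape_for` is hypothetical (it is not part of Keras), and the width/height order simply mirrors the call in train.py.

```python
def input_shape_for(data_format, width, height, channels=3):
    """Return an input_shape tuple ordered for the given image_data_format.

    Hypothetical helper for illustration; not a Keras function.
    """
    if data_format == "channels_first":
        # Channel axis comes first, e.g. (3, 182, 182)
        return (channels, width, height)
    if data_format == "channels_last":
        # Channel axis comes last, e.g. (182, 182, 3)
        return (width, height, channels)
    raise ValueError("unknown image_data_format: %r" % (data_format,))


# In train.py this could be used roughly as follows (assumes Keras is installed):
#   from keras import backend as K
#   shape = input_shape_for(K.image_data_format(), img_width, img_height)
#   model = applications.inception_resnet_v2.InceptionResNetV2(
#       include_top=False, weights='imagenet', input_shape=shape)
print(input_shape_for("channels_first", 182, 182))
```

Reading the format from `K.image_data_format()` rather than hard-coding it keeps the script working when the backend or keras.json changes.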

Then I switched to the TensorFlow backend to see how it would play out, in case I had misunderstood how this process works. But when I started training with the TensorFlow backend, I got the following error:

[screenshot of the error output]

As you can see it gets killed again. I am not sure what to do next. Some help would be great.

P.S. I am sorry for the screenshots. You will have to zoom in a little to get a better view.

Upvotes: 0

Views: 398

Answers (1)

rgaut

Reputation: 3579

The Deep Learning AMI was mostly not supported on the t2 instance type. It should work on most of the good CPU instance types (such as C4 and C5), on GPU instance types (G3, P2, and P3), and on many other instance types.
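One way to check whether the instance is simply too small: on Linux, a process that ends with a bare "Killed" message has often been stopped by the kernel's out-of-memory killer. That is an assumption here, since the screenshots do not confirm it. The sketch below reads /proc/meminfo before training starts; the function names `parse_meminfo` and `available_gib` are hypothetical helpers, not part of any library.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB"
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_gib(meminfo):
    """Return MemAvailable (falling back to MemFree) converted to GiB."""
    kib = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    return kib / (1024 ** 2)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        print("Available memory: %.2f GiB" % available_gib(info))
    except OSError:
        # /proc/meminfo only exists on Linux
        print("/proc/meminfo is not readable on this system")
```

If the reported figure is far below what the model and a batch of images need, moving to a larger instance type is the likely fix.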

Upvotes: 1
