Reputation: 31
I am trying to train an Inception-ResNet-v2 model on my own dataset on Amazon's Deep Learning AMI.
When I train on my local machine, training starts as usual, but when I train on the AWS instance, the process gets killed.
First I tried to train with the MXNet backend. It gave the following error:
Notice that it gets killed.
So I opened the Keras config file with
nano ~/.keras/keras.json
and set the image data format to channels_first:
{
    "image_data_format": "channels_first",
    "backend": "mxnet"
}
Then I got the error:
Traceback (most recent call last):
  File "train.py", line 17, in <module>
    model = applications.inception_resnet_v2.InceptionResNetV2(include_top=False, weights='imagenet', input_shape = (img_width, img_height, 3))
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/keras_applications/inception_resnet_v2.py", line 243, in InceptionResNetV2
    weights=weights)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/keras_applications/imagenet_utils.py", line 296, in _obtain_input_shape
    '`input_shape=' + str(input_shape) + '`')
ValueError: The input must have 3 channels; got `input_shape=(182, 182, 3)`
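A note on the ValueError above: with "image_data_format" set to "channels_first", Keras reads the first entry of input_shape as the channel axis, so (182, 182, 3) is treated as 182 channels and rejected. Below is a minimal sketch of building the shape to match the configured format. The helper names image_data_format_from_config and build_input_shape are hypothetical (not part of Keras), and the config is passed in as a string rather than read from ~/.keras/keras.json so the snippet runs on its own. This only addresses the shape error, not the process being killed.

```python
import json


def image_data_format_from_config(config_text):
    """Return the image_data_format stored in keras.json-style text.

    Hypothetical helper. Falls back to "channels_last", which is the
    Keras default when the key is absent.
    """
    config = json.loads(config_text)
    return config.get("image_data_format", "channels_last")


def build_input_shape(img_width, img_height, data_format, channels=3):
    """Place the channel axis where the given data format expects it.

    Hypothetical helper. For channels_last it keeps the same
    (img_width, img_height, 3) ordering that train.py already uses.
    """
    if data_format == "channels_first":
        return (channels, img_width, img_height)
    if data_format == "channels_last":
        return (img_width, img_height, channels)
    raise ValueError("Unknown image data format: " + repr(data_format))


# Same settings as the keras.json shown in the question.
config_text = '{"image_data_format": "channels_first", "backend": "mxnet"}'
data_format = image_data_format_from_config(config_text)
input_shape = build_input_shape(182, 182, data_format)
print(input_shape)  # (3, 182, 182)
```

Inside train.py itself, keras.backend.image_data_format() returns the active setting, so its result could be passed as data_format instead of parsing the file by hand.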
Then I switched to the TensorFlow backend to see how it would play out, since I might be misunderstanding how this process works. But when I started training with the TensorFlow backend, I got the following error:
As you can see, it gets killed again. I am not sure what to do next. Some help would be great.
P.S. I am sorry for the screenshots. You may have to zoom in a little to get a better view.
Upvotes: 0
Views: 398
Reputation: 3579
The Deep Learning AMI is mostly not supported on the t2 instance type. It should work on most of the more capable CPU instance types (such as C4 and C5), on GPU instance types (G3, P2, and P3), and on many other instance types.
Upvotes: 1