Reputation: 823
I would like to train a neural network whilst utilising all 4 GPUs on my g2.8xlarge EC2 instance using MXNet. I am using the following AWS Deep Learning Linux community AMI:
Deep Learning AMI Amazon Linux - 3.3_Oct2017 - ami-999844e0
As per these instructions, when I connect to the instance I switch to Keras v1 with the MXNet backend by issuing this command:
source ~/src/anaconda3/bin/activate keras1.2_p2
I have also added the context flag to my Python model's compile call so that MXNet utilises the GPUs:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'], context=gpu_list)
where gpu_list is meant to utilise all 4 GPUs.
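For reference, one way to build such a list is with device strings, one per GPU. This is a hedged sketch: the keras-mxnet fork also accepts MXNet context objects (mx.gpu(i)), so the exact form of `gpu_list` may differ by version; the count of 4 assumes a g2.8xlarge.

```python
# Sketch: one device identifier per GPU on a g2.8xlarge (4 GPUs).
# Depending on the keras-mxnet version, `context` may instead expect
# mx.gpu(i) objects rather than strings -- check your fork's docs.
n_gpus = 4
gpu_list = ["gpu(%d)" % i for i in range(n_gpus)]
print(gpu_list)  # ['gpu(0)', 'gpu(1)', 'gpu(2)', 'gpu(3)']
```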
However every time I run my code, I get this error message:
Epoch 1/300 [15:09:52] /home/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308: [15:09:52] src/storage/storage.cc:113: Compile with USE_CUDA=1 to enable GPU usage
and
RuntimeError: simple_bind error. Arguments: dense_input_1: (25, 34L) [15:09:52] src/storage/storage.cc:113: Compile with USE_CUDA=1 to enable GPU usage
I have checked the config.mk file in /home/ec2-user/src/mxnet and it contains USE_CUDA=1. I have also run the 'make' command to try and recompile MXNet with the USE_CUDA=1 flag - no change.
Am I having this issue because I'm using the virtual environment the AWS documentation says to use? Has anyone else had this issue with MXNet on the AWS Deep Learning Amazon Linux AMI using this virtual env?
Any suggestions greatly appreciated.
Upvotes: 2
Views: 10447
Reputation: 111
This is because the Keras Conda environment depends on the CPU-only `mxnet` pip package, which is built without CUDA support (hence the `USE_CUDA=1` error). You can install the GPU version inside the Conda environment with:
pip install mxnet-cu80
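If the CPU package is already present, it may also help to remove it first so the GPU build takes precedence, then confirm that a GPU context works. These commands are a sketch of that setup inside the activated `keras1.2_p2` environment; `mxnet-cu80` assumes the AMI ships CUDA 8.0.

```shell
# Inside the activated conda env (keras1.2_p2):
pip uninstall -y mxnet        # remove the CPU-only build, if installed
pip install mxnet-cu80        # GPU build compiled against CUDA 8.0

# Quick check that MXNet can allocate on a GPU:
python -c "import mxnet as mx; print(mx.nd.zeros((1,), ctx=mx.gpu(0)))"
```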
Upvotes: 6