Reputation: 191
I have run the model with LSTM as the first layer successfully. Out of curiosity, I replaced LSTM with CuDNNLSTM, but after model.fit it returned the following error message:
UnknownError: Fail to find the dnn implementation.
[[{{node cu_dnnlstm_5/CudnnRNN}} = CudnnRNN[T=DT_FLOAT, _class=["loc:@training_2/Adam/gradients/cu_dnnlstm_5/CudnnRNN_grad/CudnnRNNBackprop"], direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="lstm", seed=87654321, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnnlstm_5/transpose, cu_dnnlstm_5/ExpandDims_1, cu_dnnlstm_5/ExpandDims_1, cu_dnnlstm_5/concat_1)]]
[[{{node metrics_3/mean_squared_error/Mean_1/_1877}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4852_metrics_3/mean_squared_error/Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I have tried TestCudnnLSTM() from this discussion and it passes the test successfully:
Keras version: 2.2.4
Tensorflow version: 1.12.0
Creating Model
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
cu_dnnlstm_1 (CuDNNLSTM)     (None, 1000, 1)           16
=================================================================
Total params: 16
Trainable params: 16
Non-trainable params: 0
_________________________________________________________________
None
Model compiled
It seems that the problem appears during model fitting, but I don't know exactly what the problem is.
Upvotes: 18
Views: 26661
Reputation: 66
For me, the issue was resolved after installing the correct version of TensorFlow for the installed CUDA version. The correct matches can be seen here: https://www.tensorflow.org/install/source#gpu. To check the CUDA version installed on the machine, use nvcc --version.
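If you are on TensorFlow 2.3 or later, a hedged sketch (my own addition, not part of the original answer) for checking which CUDA and cuDNN versions your TensorFlow wheel was built against is:
import tensorflow as tf
# Prints the CUDA/cuDNN versions this TensorFlow build expects
# (tf.sysconfig.get_build_info() is available in TF 2.3+; keys may vary by build).
info = tf.sysconfig.get_build_info()
print(info.get('cuda_version'), info.get('cudnn_version'))
You can then compare these values against what nvcc --version reports on your machine.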
Upvotes: 0
Reputation: 1
My code worked after I checked the versions of all the following packages: CUDA, cuDNN, TensorFlow and gcc. You need to find the matching version for each of them; hope it helps!
My versions are below:
Upvotes: 0
Reputation: 136
Also check that cuDNN is present for the CUDA version your application uses.
Upgrading TensorFlow can cause it to use another CUDA version.
For instance, tensorflow-2.3 uses CUDA 10.1 but tensorflow-2.5 uses 11.2.
I got the same error on Windows and I had to copy the latest cuDNN DLLs into the "c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2" folder.
Upvotes: 0
Reputation: 41
I would recommend checking whether any other kernel has imported tensorflow or keras. If so, shut down that kernel, even if it is not busy. It solved the problem in my case.
Upvotes: 1
Reputation: 15
I installed TensorFlow and Keras using conda in the virtual environment and this solved it:
conda install tensorflow
conda install keras
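As a quick sanity check (this snippet is my own sketch, not part of the original answer), you can then confirm that the conda-installed TensorFlow actually sees the GPU:
import tensorflow as tf
# Should list at least one GPU device if CUDA/cuDNN are set up correctly.
# (On TF < 2.1, use tf.config.experimental.list_physical_devices instead.)
print(tf.config.list_physical_devices('GPU'))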
Upvotes: 0
Reputation: 1348
For TensorFlow v2, one solution would be:
import tensorflow as tf
# Let memory on the first visible GPU grow on demand
# instead of being allocated up front.
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
Then you can use the Keras model too:
from tensorflow.keras.models import Model
This solution worked for me; it enables memory growth for only one GPU.
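If you have more than one GPU, a small variation of the same idea (my own untested sketch, not from the original answer) is to loop over all detected devices:
import tensorflow as tf
# Enable memory growth for every visible GPU.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)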
Upvotes: 41
Reputation: 86
In TensorFlow 2.0 I got the same error while running an RNN LSTM model. The reason was a too-low version of my cuDNN. The TensorFlow GPU requirements page recommends having
cuDNN SDK >= 7.4.1.
You can refer to https://www.tensorflow.org/install/gpu for more details.
(Asked in the TensorFlow Reddit forum.)
Upvotes: 1
Reputation: 129
If you're getting this error while fitting a Keras NN, put this code with your imports:
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
# Allow GPU memory to grow as needed instead of pre-allocating all of it,
# then register the configured session with Keras.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
set_session(sess)
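If you are on TensorFlow 2.x with tf.keras (where keras.backend.tensorflow_backend no longer exists), a rough equivalent sketch, untested and not part of the original answer, would be:
import tensorflow as tf
# TF 2.x variant via the compat.v1 API; tf.keras picks up this session in graph/compat mode.
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(sess)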
Upvotes: 6
Reputation: 11
I had the same issue when I updated TensorFlow to 1.12. The error was resolved after updating my cuDNN version from 7 to 7.5. I followed the steps mentioned in the URL below for updating the cuDNN version (note: the steps in the link are for installing cuDNN, but the same applies for an update as well):
https://jhui.github.io/2017/09/07/AWS-P2-CUDA-CuDNN-TensorFlow/
Upvotes: 1
Reputation: 476
Make sure you have the proper NVIDIA driver version for the version of CUDA you are using. You can check it here: https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility
I was using CUDA 9.0 but had an NVIDIA driver older than 384.81. Updating the NVIDIA driver to a newer one fixed the problem for me.
Upvotes: 1