Reputation: 1065
I recently installed tensorflow-gpu using pip, but importing it gives the following error:
ImportError: libcudnn.so.7: cannot open shared object file: No such file or directory
I have gone through all the Stack Overflow answers related to this issue, but none of them worked for me.
libcudnn.so.7 is present in both of the following directories: /usr/local/cuda/lib64 and /usr/local/cuda-9.0/lib64.
Also, I have added the following paths to my .bashrc file:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64\${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64\${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Please help me resolve this.
Upvotes: 22
Views: 35845
Reputation: 37
I had a libcudnn.so.7 error while trying to install MXNet on Google Colab. What ultimately solved it for me was installing libcudnn7 after some other steps; here are the major steps I performed. I hope this helps anyone else who is slogging through this kind of mess like I did.
My specific need was to downgrade Cuda in Google Colab; at the time of writing this it comes with 11.8 but MXNet only supports older versions. I was following this tutorial: https://aconcaguasci.blogspot.com/2019/12/setting-up-cuda-100-for-mxnet-on-google.html
I followed the majority of it including:
#Uninstall the current CUDA version
!apt-get --purge remove cuda nvidia* libnvidia-*
!dpkg -l | grep cuda- | awk '{print $2}' | xargs -n1 dpkg --purge
!apt-get remove cuda-*
!apt autoremove
!apt-get update
#Download CUDA 10.0
!wget --no-clobber https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
#install CUDA kit dpkg
# Note: I piped yes to answer the config file prompt with installing new version
!yes | dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
!sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
!apt-get update
!apt-get install cuda-10-0
# Although I did not encounter a `libcurand.so.10` error yet, I still ran this part too:
#Solve libcurand.so.10 error
!wget --no-clobber http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
#-nc, --no-clobber: skip downloads that would download to existing files.
!apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
!apt-get update
HERE was the point where I could pip install mxnet-cu100 and it would fail on import mxnet as mx with "OSError: libcudnn.so.7: cannot open shared object file: No such file or directory".
!find / -iname libcudnn would only return two folders, /var/lib/dpkg/alternatives/libcudnn and /etc/alternatives/libcudnn.
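One possible reason the search turned up only those two entries: -iname libcudnn matches that exact name, so a versioned file such as libcudnn.so.7 would not match without a wildcard. Below is a minimal Python sketch of a wildcard search. The function name find_shared_libs, the default pattern, and the /usr search root are illustrative assumptions, not part of the tutorial.

```python
import fnmatch
import os


def find_shared_libs(root, pattern="libcudnn.so*"):
    """Return sorted paths under root whose file name matches pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatch applies shell-style wildcards to the bare file name
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # /usr is an assumed search root; adjust it for your machine.
    for path in find_shared_libs("/usr"):
        print(path)
```

If this prints nothing, the library is probably not installed under that root at all, which would be consistent with the import error.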
For example, for Cuda 9.0 and cuDNN 7.4.1:
$ sudo apt-get install libcudnn7=7.4.1.5-1+cuda9.0
$ sudo apt-get install libcudnn7-devel=7.4.1.5-1+cuda9.0
I swapped the Cuda version cuda9.0 for cuda10.0 and ran:
!sudo apt-get install libcudnn7=7.4.1.5-1+cuda10.0
I could not install libcudnn7-devel because apt reported "Unable to locate package". (NVIDIA's Debian packages normally name the development package libcudnn7-dev, which may be why.)
After this, I could pip install mxnet-cu100==1.9.0 (MXNet for Cuda 10.0). And of course nvcc --version would report Cuda 10.0. I was finally able to run import mxnet as mx without getting any "cannot open shared object file: ..." errors.
I validated it successfully with:
import mxnet as mx
print(mx.context.num_gpus())
a = mx.nd.ones((2, 3), mx.gpu())
b = a * 2 + 1
print(b.asnumpy())
Outputting:
1
[[3. 3. 3.]
 [3. 3. 3.]]
I realize this was with MXNet and not TensorFlow, but it was a libcudnn.so.7 error and I hope it helps anyone else coming across this, at least with Google Colab. I could not find much recent support for that, which is why I followed the tutorial I mentioned at the top.
Upvotes: 0
Reputation: 503
The reason is that some libraries are missing. Try installing
sudo apt install libcudnn7
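To check whether the library is actually missing before (or after) installing the package, you can ask the dynamic loader directly. This is a minimal sketch using Python's standard ctypes module; the helper name can_load is made up for illustration.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the shared object cannot be found or opened
        return False
    return True


if __name__ == "__main__":
    print("libcudnn.so.7 loadable:", can_load("libcudnn.so.7"))
```

A False result here means the loader itself cannot find the file, which is the same failure that TensorFlow or MXNet reports on import.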
Upvotes: -2
Reputation: 10058
FWIW, if anyone is interested, I created a shell script that installs different CUDA versions on Debian and can easily be ported to Ubuntu:
Upvotes: 0
Reputation: 51
Reinstalling cuDNN 7.0.5 fixed this for me (make sure you pick the right version from the link below). You'll need to log in to your Nvidia developer account to access the link. (If you don't have an Nvidia account, creating one is straightforward.)
https://developer.nvidia.com/rdp/cudnn-archive
Installation instructions for cuDNN: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
But I then encountered the following error:
Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Therefore, I had to download and install the right cuDNN version once again. Using the information in the error message above, I installed cuDNN 7.4.2, which fixed all the errors, and everything worked fine.
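The rule quoted in that error message can be written as a small check. This is only a sketch of my reading of the message for cuDNN 7.0 or later (same major version, loaded minor version at least as high as the compiled one); the function names are made up, and it does not handle versions older than 7.0.

```python
def parse_version(text):
    """Turn a string like '7.4.2' into a tuple of ints: (7, 4, 2)."""
    return tuple(int(part) for part in text.split("."))


def runtime_is_compatible(loaded, compiled):
    """Apply the rule from the error message (cuDNN 7.0 or later only):
    majors must match and the loaded minor must be >= the compiled minor."""
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


if __name__ == "__main__":
    print(runtime_is_compatible("7.0.5", "7.4.2"))  # False: minor 0 < 4
    print(runtime_is_compatible("7.4.2", "7.4.2"))  # True: exact match
```

The first call mirrors the situation above: 7.0.5 loaded at runtime against a build compiled with 7.4.2.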
Good Luck!
Upvotes: 5
Reputation: 412
Add the following paths to your .bashrc file:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
Upvotes: 0
Reputation: 1249
You might need to download and install NVIDIA cuDNN.
Download it from https://developer.nvidia.com/rdp/cudnn-download (you have to register an account to download it if you don't already have one). The runtime version is usually more stable than the developer version.
Upvotes: 6
Reputation: 56377
You are setting LD_LIBRARY_PATH the wrong way. I would recommend doing it this way (which is more or less the standard):
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
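To see the effect, you can inspect what actually ended up in LD_LIBRARY_PATH. As far as I can tell, the backslash before the $ in the question makes the shell treat it literally, so the resulting entry no longer names an existing directory. Below is a minimal Python sketch that reports, for each entry, whether it is a directory and whether it contains the library. The function name check_library_path and the default library name are assumptions for illustration.

```python
import os


def check_library_path(value, library="libcudnn.so.7"):
    """For each entry in a colon-separated path list, report whether the
    directory exists and whether it contains the named library file."""
    report = []
    for entry in value.split(":"):
        if not entry:
            continue  # skip empty entries from leading/trailing colons
        is_dir = os.path.isdir(entry)
        has_lib = is_dir and os.path.exists(os.path.join(entry, library))
        report.append((entry, is_dir, has_lib))
    return report


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for entry, is_dir, has_lib in check_library_path(current):
        print(f"{entry}: directory={is_dir}, has library={has_lib}")
```

Run it from a fresh shell after editing .bashrc; any entry reported with directory=False is one the loader cannot use.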
Upvotes: 15