Reputation: 571
I am trying to install support for Tensorflow GPU using the following guide:
https://www.tensorflow.org/install/gpu
I am on Ubuntu (20.04 LTS)
I've followed the instruction for the latest Ubuntu below (Cuda 11):
# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt install ./libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt-get update
# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-11-0 \
libcudnn8=8.0.4.30-1+cuda11.0 \
libcudnn8-dev=8.0.4.30-1+cuda11.0
# Reboot. Check that GPUs are visible using the command: nvidia-smi
# Install TensorRT. Requires that libcudnn8 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer7=7.1.3-1+cuda11.0 \
libnvinfer-dev=7.1.3-1+cuda11.0 \
libnvinfer-plugin7=7.1.3-1+cuda11.0
After running this and rebooting, I have Cuda 11 and CuDNN 8.
After this I installed tensorflow with a simple pip install tensorflow
, as I understood online there's no need to install tensorflow-gpu
explicitly in the newer versions of tensorflow.
This is what I'm getting after trying to import tensorflow and check physical devices:
import tensorflow as tf
Result:
2021-06-02 16:04:03.347039: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
tf.config.list_physical_devies('GPU')
Result:
2021-06-02 16:11:19.035743: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-06-02 16:11:19.067500: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-02 16:11:19.067753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 6GB computeCapability: 6.1
coreClock: 1.759GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2021-06-02 16:11:19.067771: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-02 16:11:19.069485: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-06-02 16:11:19.069529: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-06-02 16:11:19.069625: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069689: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069736: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069796: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069930: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-06-02 16:11:19.069938: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
It seems like tensorflow is complaining about 4 files (.so libraries):
I've tried to look for these in my system using the locate
command on Ubuntu, they do not exist anywhere.
I haven't added anything to my .bashrc since I was not sure what the LD_LIBRARY_PATH must be.
Upvotes: 1
Views: 1329
Reputation: 449
Try to install the missing libraries and check
apt-get install -y cuda-command-line-tools-11-4 libcublas-11-4 libcufft-11-4 libcurand-11-4 libcusolver-11-4 libcusparse-11-4
Note that the version should be based on your CUDA version.
${CUDA/./-}
If your CUDA version is 11.2 then the library version will be libcufft-11-2
Upvotes: 1
Reputation: 19
Put this library path in the ~/.bashrc file and source then try
export PATH=/usr/local/cuda-11.0/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:${LD_LIBRARY_PATH}
export CUDA_HOME=/usr/local/cuda
Please change your path according to your setup.
Upvotes: 1