DocDriven

Reputation: 3974

Best practice for upgrading CUDA and cuDNN for tensorflow

I'm currently in charge of getting tensorflow-gpu 1.8 to work on my machine. I've been using tf-gpu 1.2 until now, but due to some required features, I have to upgrade my installation.

Before doing so, I wanted to check if there is a best practice to do this. My current setup looks like this:

As written on the tf-homepage, I would have to use CUDA v9.0 as well as cuDNN v7.1. As all these instructions refer to a clean install and not an update, I'm not sure if it would be best to uninstall the old versions first.

Please share your experiences if you have already had the same issue. Thank you!

Upvotes: 20

Views: 65327

Answers (2)

DocDriven

Reputation: 3974

Thanks @joão gabriel s.f. I was able to uninstall CUDA 8.0/cuDNN 5.1 and install the latest version of tensorflow. As the whole procedure was a little confusing to me, I decided to post a quick walkthrough that may help people in the same situation.

UNINSTALL

First, I uninstalled CUDA and all its dependencies. As I had installed it via the package manager, I used apt-get to remove it. For runfile installations, you can check this.

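# Remove the cuda package together with its configuration files
sudo apt-get --purge remove cuda
# Remove dependencies that are no longer needed
sudo apt-get autoremove
# Purge every package left in the "rc" state (removed, config files remain); -r skips the purge if there are none
dpkg --list | grep "^rc" | cut -d " " -f 3 | xargs -r sudo dpkg --purge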

Also, I checked for any cuda folders in /usr/local/ and removed them. Removing these folders also deleted the cuDNN headers and libraries that had been copied into them, so cuDNN needed no separate uninstall step.
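A minimal sketch of how the leftover folders could be listed before deleting anything. The helper name `list_cuda_dirs` is hypothetical, and it assumes the toolkit was installed under the default /usr/local prefix:

```shell
# List CUDA directories and symlinks directly under a prefix (default: /usr/local).
# Prints one path per line, or nothing if none are found.
list_cuda_dirs() {
    local prefix="${1:-/usr/local}"
    find "$prefix" -mindepth 1 -maxdepth 1 -name 'cuda*' 2>/dev/null | sort
}

list_cuda_dirs /usr/local
```

Anything it prints (typically a versioned folder such as cuda-8.0 plus a cuda symlink) can be reviewed first and then removed with sudo rm -rf.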

INSTALL

Check the driver of the graphics card first. CUDA 9.0 works with the v384.111 driver (so no 390.xxx needed), so I had nothing to do here.
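The driver check can be scripted roughly as follows. This is a sketch, not an official procedure: the helper `version_ge` is hypothetical, the minimum of 384.81 is my assumption for CUDA 9.0 on Linux and should be confirmed against NVIDIA's compatibility chart, and `sort -V` requires GNU coreutils.

```shell
# Return success if version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: minimum driver for CUDA 9.0; confirm against the compatibility chart.
min_driver="384.81"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the driver for its version; keep only the first GPU's line.
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$min_driver"; then
        echo "driver $driver is new enough"
    else
        echo "driver $driver is older than $min_driver"
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

A version-aware comparison matters here: read as decimal numbers, 384.111 would look smaller than 384.81, although it is the newer driver.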

I downloaded CUDA Toolkit 9.0 here as deb (local). In the same folder, I executed

sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Then set the environment variables:

export PATH=${PATH}:/usr/local/cuda-9.0/bin
export CUDA_HOME=/usr/local/cuda-9.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-9.0/lib64

Afterwards I verified my installation as described here.
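As a quick sanity check alongside the official verification steps, something like the following can confirm that the new toolkit is the one found on PATH. The helper name `nvcc_release` is hypothetical, and the parsing assumes the usual "release X.Y," line in the output of nvcc --version:

```shell
# Print the toolkit release reported by nvcc (for example 9.0).
# Returns non-zero if nvcc is not on PATH.
nvcc_release() {
    command -v nvcc >/dev/null 2>&1 || return 1
    nvcc --version | sed -n 's/.*release \([0-9.]*\),.*/\1/p'
}

if release="$(nvcc_release)"; then
    echo "nvcc reports CUDA $release"
else
    echo "nvcc not found on PATH; check the PATH export above"
fi
```

If this reports an older release, a stale entry earlier in PATH is the likely cause.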

I downloaded cuDNN 7.1 from the archive as tarball and installed it via

tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
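To confirm which cuDNN version ended up in place, the version macros in the copied header can be read back. A sketch, assuming the header lives at /usr/local/cuda/include/cudnn.h and defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as cuDNN 7 does; the helper name `cudnn_version` is hypothetical:

```shell
# Print the cuDNN version as major.minor.patch, read from the header's #define lines.
# Returns non-zero if the header cannot be read.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    [ -r "$header" ] || return 1
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$header"
}

cudnn_version || echo "cudnn.h not found in /usr/local/cuda/include"
```

For the tarball used here, the major and minor numbers should come out as 7 and 1.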

After starting the Python shell, I was able to import tensorflow and run a simple graph.

Thanks again and have a nice week!

Upvotes: 22

joão gabriel s.f.

Reputation: 366

See this documentation. It says to always remove the old version of CUDA first.

Since CUDA 9.1 requires driver version 390 or later (check the compatibility chart), it would also be good to remove your current driver. No need to worry, though: the 390 driver is installed along with CUDA 9.1.

As personal advice, I would remove almost everything related to NVIDIA/CUDA (excluding Python). For some reason, it is pretty easy to mess things up when installing and setting up CUDA on Ubuntu.

If you have any problems after the install, see ubuntu-16-04-lts-login-loop-after-updating-driver-nvidia; it's a post which I answered a while ago.

Upvotes: 7
