Installing TensorFlow on Jetson TX2: cannot stat '/usr/include/cudnn.h'

I am trying to install TensorFlow on my Jetson TX2 and am therefore following this tutorial from JetsonHacks: https://www.youtube.com/watch?v=V51IO7kNXCg

When I execute ./setTensorFlowEV.sh, I get the following output:

~/installTensorFlowTX2$ ./setTensorFlowEV.sh 
mkdir: cannot create directory ‘/usr/lib/aarch64-linux-gnu/include/’: File exists
cp: cannot stat '/usr/include/cudnn.h': No such file or directory
You have bazel 0.5.2- installed.
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Invalid path to CUDA 8.0 toolkit. /usr/local/cuda/lib64/libcudart.so.8.0 cannot be found

The content of the file setTensorFlowEV.sh: https://github.com/jetsonhacks/installTensorFlowTX2/blob/master/setTensorFlowEV.sh

I tried to locate cudnn.h on my system (locate cudnn.h), but it isn't anywhere. I also searched for the package that provides the shared object (sudo apt-file search libcudart.so.8.0), but that returned nothing either.

So I would like to know how to get rid of these errors.

Important note: I don't have physical access to the board, so I cannot flash it or anything like that.

I tried disabling CUDA by setting TF_NEED_CUDA=0,

which gives:

~/installTensorFlowTX2$ ./setTensorFlowEV.sh 
mkdir: cannot create directory ‘/usr/lib/aarch64-linux-gnu/include/’: File exists
cp: cannot stat '/usr/include/cudnn.h': No such file or directory
You have bazel 0.5.2- installed.
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Configuration finished

but when I try to build TensorFlow I get:

~/installTensorFlowTX2$ ./buildTensorFlow.sh 
ERROR: /home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD:4:1: Traceback (most recent call last):
    File "/home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD", line 4
        error_gpu_disabled()
    File "/home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/error_gpu_disabled.bzl", line 3, in error_gpu_disabled
        fail("ERROR: Building with --config=c...")
ERROR: Building with --config=cuda but TensorFlow is not configured to build with GPU support. Please re-run ./configure and enter 'Y' at the prompt to build with GPU support.
ERROR: no such target '@local_config_cuda//crosstool:toolchain': target 'toolchain' not declared in package 'crosstool' defined by /home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/local_config_cuda/crosstool/BUILD.
INFO: Elapsed time: 0.403s

I don't have a ./configure script anywhere, so I set export TF_NEED_CUDA=0 in my ./buildTensorFlow.sh file instead:

#this is my modified buildTensorFlow.sh file
export TF_NEED_CUDA=0
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=6.0.21
export CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu/
export TF_CUDA_COMPUTE_CAPABILITIES=6.2

# Build Tensorflow
cd $HOME/tensorflow
bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package

Upvotes: 0

Views: 2966

Answers (1)

Matteo Ragni

Reputation: 2956

DISCLAIMER for other readers: I cannot test this, and I am assuming that the board has previously been flashed with Nvidia L4T Ubuntu 16.04. If it has not, stop reading and good luck: the board needs that image to run reliably and to be stable for an embedded application. Any deviation from it may cause all sorts of unknown behavior.

The OP stated that the board has been flashed with L4T 27.1, which corresponds to Nvidia JetPack 3.0; you can download it from the Nvidia archives. To understand which version of JetPack you need for your L4T release, check the JetPack release notes, which list the matching L4T version.
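If you are unsure which L4T release is on the board, it can usually be read over ssh. The following is a minimal sketch, not something I have tested on a TX2: it assumes the release is recorded on the first line of /etc/nv_tegra_release in the form `# R27 (release), REVISION: 1.0, ...`, and l4t_version is a helper name made up for this example.

```shell
# Print the L4T release as MAJOR.REVISION, or "unknown" if it cannot be read.
# Assumption: the first line of the release file looks like
#   # R27 (release), REVISION: 1.0, GCID: ..., BOARD: ...
l4t_version() {
    release_file="${1:-/etc/nv_tegra_release}"
    if [ ! -r "$release_file" ]; then
        echo "unknown"
        return 1
    fi
    # Keep the number after "R" and the value after "REVISION:".
    head -n 1 "$release_file" |
        sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\).*/\1.\2/p'
}
```

Calling l4t_version with no argument on the board reads the default file; for a first line like the one above it prints 27.1.0.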

Once JetPack is downloaded, we need to unpack it and run one of its internal binaries to create the repository JSON file.

bash ./JetPack-L4T-3.0-linux-x64.run --noexec
cd _installer
./Chooser

Chooser requires (at least) libpng12 to be installed on your host. If you check the directory, it has generated repository.json, which we need to inspect. From that file it appears that NVIDIA provides the same packages for the TX1 and the TX2, so we need to focus on the TX1 packages.

Inspecting the JSON, two packages appear relevant: the CUDA repository package (cuda-repo*.deb) and the cuDNN archive (cuDNN-....zip).

You must download both packages onto your board over ssh (wget http...).
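One way to pull the candidate download URLs out of repository.json is a plain text search. This is only a sketch: I am assuming the URLs appear in the file as ordinary double-quoted strings and that the file names contain "cuda-repo" or "cudnn"; list_package_urls is a made-up helper name, and the real layout of the JSON may differ.

```shell
# List URLs in the given JSON file whose text mentions cuda-repo or cudnn.
# Assumption: each URL is stored as a double-quoted string in the file.
list_package_urls() {
    json_file="${1:-repository.json}"
    # Extract every http(s) URL, then keep only the CUDA/cuDNN ones.
    grep -E -o 'https?://[^"]+' "$json_file" | grep -i -E 'cuda-repo|cudnn'
}
```

Run list_package_urls inside the _installer directory on the host, pick the TX1 entries, and pass those URLs to wget on the board.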

The first one you should install is the cuda repository:

 sudo dpkg -i cuda-repo*.deb

This makes many packages available locally, including libcudart, which you need to install:

 sudo apt update
 sudo apt install cuda-toolkit-8.0 # (this may be enough)

There are other packages that may need installation (use ls /var/cuda* to list them all).
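Before rerunning the TensorFlow scripts, it may be worth confirming that the file the configure step complained about now exists. The sketch below is a generic existence check; check_files is a hypothetical helper name, and the path in the usage note is the one from the error message in the question.

```shell
# Report which of the given files exist; return 1 if any is missing.
check_files() {
    missing=0
    for f in "$@"; do
        if [ -e "$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_files /usr/local/cuda/lib64/libcudart.so.8.0 should report the library as found once the toolkit is installed. After the cuDNN step, the same call can be repeated with /usr/include/cudnn.h.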

To install cuDNN, unzip the cuDNN archive you downloaded earlier into a temporary directory:

unzip cuDNN-....zip
cd cuDNN

There are three deb files to install:

sudo dpkg -i *.deb

That should install all the files needed in the correct directories. At this point you should try to restart the compilation process. Before that, however, I would change the line that sets TF_CUDNN_VERSION to a 5.1.x version (in this case 5.1.5).
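Rather than guessing the cuDNN version, you can read it from the installed header. This is a sketch under the assumption that cudnn.h defines the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on lines of the form `#define NAME value`; cudnn_version is a made-up helper name.

```shell
# Print the cuDNN version (MAJOR.MINOR.PATCH) found in a cudnn.h header.
# Assumption: the header has lines like "#define CUDNN_MAJOR 5".
cudnn_version() {
    header="${1:-/usr/include/cudnn.h}"
    [ -r "$header" ] || return 1
    # Each awk call prints the value of the first matching #define.
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3; exit}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3; exit}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3; exit}' "$header")
    echo "$major.$minor.$patch"
}
```

The result can then be used for the export, for instance export TF_CUDNN_VERSION=$(cudnn_version), so the value in the script matches what is actually installed.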

Upvotes: 1
