Reputation: 377
I installed the TensorFlow nightly build with the command
pip install tf-nightly-gpu --prefix=/tf/install/path
When I try to run any XLA example, TensorFlow fails with the error "Unable to find libdevice dir. Using '.' Failed to compile ptx to cubin. Will attempt to let GPU driver compile the ptx. Not found: /usr/local/cuda-10.0/bin/ptxas not found".
So apparently TensorFlow cannot find my CUDA installation. On my system, CUDA is installed in /cm/shared/apps/cuda/toolkit/10.0.130. Since I didn't build TensorFlow from source, XLA searches /usr/local/cuda-* by default, and because that folder does not exist here, it reports an error.
My current workaround is to create a symbolic link. I also checked the TensorFlow source in tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc, which contains the comment "// CUDA location explicitly specified by user via --xla_gpu_cuda_data_dir has highest priority." How do I pass a value to this flag? I tried the following two environment variables, but neither works:
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda10.0/toolkit/10.0.130/"
export TF_XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda10.0/toolkit/10.0.130/"
So how do I use the flag "--xla_gpu_cuda_data_dir"? Thanks.
Upvotes: 6
Views: 17936
Reputation: 1545
This worked for me.
tensorflow 2.11.0 gpu_py310hf8ff8df_0
ii nvidia-dkms-525 525.105.17-0ubuntu0.22.04.1 amd64 NVIDIA DKMS package
ii nvidia-driver-525 525.105.17-0ubuntu0.22.04.1 amd64 NVIDIA driver metapackage
nvidia-cuda-toolkit not installed
NVIDIA T4 on GCE, Ubuntu 22.04 LTS minimal
conda install -c nvidia cuda-nvcc
ln -s /path/to/conda-env/lib/libdevice.10.bc .
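If you are not sure where conda put the file, a small Python sketch like the one below can locate it. Note that find_libdevice is a hypothetical helper written for this post, not part of TensorFlow or conda, and it only assumes the file name starts with "libdevice" and ends with ".bc". The symlink presumably helps because, per the log in the question, XLA falls back to '.' when it cannot find a libdevice directory.

```python
import os


def find_libdevice(root):
    """Return sorted paths of libdevice*.bc files found anywhere under root.

    Hypothetical helper: it only matches on the file name and knows
    nothing about how XLA itself searches for libdevice.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("libdevice") and name.endswith(".bc"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Example usage (the path is a placeholder for your own conda env):
# for path in find_libdevice("/path/to/conda-env"):
#     print(path)
```

Whatever path it prints is the one to use as the target of the ln -s command.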
I couldn't get XLA_FLAGS to work:
2023-04-21 09:17:00.947644: F tensorflow/compiler/xla/parse_flags_from_env.cc:226] Unknown flags in XLA_FLAGS: -–xla_gpu_cuda_data_dir=/home/rac/fulltf2/fullcuda.env/lib
Perhaps you meant to specify these on the TF_XLA_FLAGS envvar?
Aborted (core dumped)
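One thing that may be worth checking: in the log above, the flag appears to begin with a hyphen followed by an en dash rather than two plain ASCII hyphens. If that is what was actually in the environment variable (and not just an artifact of pasting the log), it could explain the "Unknown flags" message. A minimal sketch for spotting such characters is below; non_ascii_chars is a hypothetical helper, and the flag string is only a stand-in modelled on the log.

```python
def non_ascii_chars(flags):
    """Return (index, character) pairs for every non-ASCII character in flags."""
    return [(i, ch) for i, ch in enumerate(flags) if ord(ch) > 127]


# Stand-in value: an ASCII hyphen followed by an en dash (U+2013),
# which is what the log above appears to contain.
flags = "-\u2013xla_gpu_cuda_data_dir=/path/to/cuda"

for index, ch in non_ascii_chars(flags):
    print(f"position {index}: U+{ord(ch):04X}")
```

An empty result means the string is plain ASCII, so the dashes at least are not the problem.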
Upvotes: 2
Reputation: 111
You can run the following in a terminal:
export XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda
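If you would rather set it from inside a Python script than in the shell, something like the sketch below should be equivalent, assuming the variable is set before TensorFlow is imported (the safe order, since the flags are read from the environment). with_cuda_data_dir is a hypothetical helper, and /path/to/cuda is a placeholder for your own CUDA root.

```python
import os


def with_cuda_data_dir(cuda_dir, existing=""):
    """Return an XLA_FLAGS string that points --xla_gpu_cuda_data_dir at cuda_dir.

    Other flags already present in `existing` are kept; an earlier
    --xla_gpu_cuda_data_dir entry is replaced.
    """
    prefix = "--xla_gpu_cuda_data_dir="
    kept = [flag for flag in existing.split() if not flag.startswith(prefix)]
    return " ".join(kept + [prefix + cuda_dir])


# Placeholder path: replace with the directory that contains your CUDA toolkit.
os.environ["XLA_FLAGS"] = with_cuda_data_dir(
    "/path/to/cuda", os.environ.get("XLA_FLAGS", "")
)

# Import TensorFlow only after XLA_FLAGS has been set:
# import tensorflow as tf
```

The helper keeps whatever else was already in XLA_FLAGS, so it does not clobber flags set elsewhere.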
Upvotes: 11
Reputation: 341
There is a code change that addresses this issue, but it is not clear how to use it. See https://github.com/tensorflow/tensorflow/issues/23783
Upvotes: 1