Reputation: 1012
I'm running the tutorial example for XLA with TensorFlow compiled from source. Running python mnist_softmax_xla.py results in the following error:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
I tensorflow/core/platform/default/cuda_libdevice_path.cc:35] TEST_SRCDIR environment variable not set: using local_config_cuda/cuda under this executable's runfiles directory as the CUDA root.
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187] StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187] StreamExecutor device (0): Tesla K80, Compute Capability 3.7
W tensorflow/core/framework/op_kernel.cc:993] Not found: ./libdevice.compute_35.10.bc not found
Traceback (most recent call last):
  File "/mnt/software/envs/xla/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/mnt/software/envs/xla/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/mnt/software/envs/xla/lib/python3.4/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: ./libdevice.compute_35.10.bc not found
[[Node: cluster_0/_0/_1 = _XlaLaunch[Targs=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], Tconstants=[DT_INT32], Tresults=[DT_FLOAT, DT_FLOAT], function=cluster_0[_XlaCompiledKernel=true, _XlaNumConstantArgs=1], _device="/job:localhost/replica:0/task:0/gpu:0"](Shape_2, _recv_Placeholder_0/_3, _recv_Placeholder_1_0/_1, Variable_1, Variable)]]
I have CUDA 8 installed with cuDNN 5.1. The file libdevice.compute_35.10.bc does exist on the machine:
$ find /usr/local/cuda/ -type f | grep libdevice.compute_35.10.bc
/usr/local/cuda/nvvm/libdevice/libdevice.compute_35.10.bc
My hunch is that this has something to do with the message "TEST_SRCDIR environment variable not set: using local_config_cuda/cuda under this executable's runfiles directory as the CUDA root.", but I'm not sure what to do about it.
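For context, the only XLA-specific part of the tutorial script is the session-level JIT flag. The sketch below (a toy stand-in graph, not the tutorial's actual model) shows roughly how that flag is enabled with the TF 1.x API; running any XLA-clustered op is what triggers compilation and the libdevice lookup:

import numpy as np
import tensorflow as tf

# Turn on XLA JIT compilation for the whole session (TF 1.x API).
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1

# Minimal stand-in graph; mnist_softmax_xla.py builds the real softmax model.
x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, w) + b

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # The first run of an XLA-clustered op compiles it, which is where
    # libdevice.compute_35.10.bc gets looked up.
    sess.run(y, feed_dict={x: np.zeros((1, 784), dtype=np.float32)})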
Upvotes: 1
Views: 4170
Reputation: 191
https://github.com/tensorflow/tensorflow/pull/7079 should be able to fix this. Thanks for the bug report!
Upvotes: 1
Reputation: 4147
The key is this log message:
I tensorflow/core/platform/default/cuda_libdevice_path.cc:35] TEST_SRCDIR environment variable not set: using local_config_cuda/cuda under this executable's runfiles directory as the CUDA root.
(I missed that message at first; I actually found the relevant file by digging around in the sources and only then noticed it in the logs.)
For reasons that I do not currently understand, XLA does not look in /usr/local/cuda (or whatever directory you gave when you ran ./configure) for libdevice. Per cuda_libdevice_path.cc [1], it's looking for a symlink that was created specifically to point it to libdevice.
I'm going to loop in the person who wrote this code to figure out what it's supposed to be doing. In the meantime, I was able to work around it myself as follows:
$ mkdir local_config_cuda
$ ln -s /usr/local/cuda local_config_cuda/cuda
$ TEST_SRCDIR=$(pwd) python my_program.py
The important thing is to set TEST_SRCDIR to the parent of the local_config_cuda directory.
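If you would rather keep the workaround inside the Python script itself, the same idea can be sketched as below. This assumes the toolkit is at /usr/local/cuda (adjust if ./configure pointed elsewhere), and it assumes that setting the environment variable and creating the symlink before TensorFlow's first XLA compilation is enough:

import os

# Create local_config_cuda/cuda -> /usr/local/cuda next to the script,
# mirroring the shell commands above (the CUDA path is an assumption).
cuda_root = "/usr/local/cuda"
workdir = os.getcwd()
os.makedirs(os.path.join(workdir, "local_config_cuda"), exist_ok=True)
link = os.path.join(workdir, "local_config_cuda", "cuda")
if not os.path.islink(link):
    os.symlink(cuda_root, link)

# TEST_SRCDIR must point at the parent of local_config_cuda, and it has to
# be set before the XLA backend looks up libdevice.
os.environ["TEST_SRCDIR"] = workdir

import tensorflow as tf  # imported only after the environment is prepared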
Sorry for the trouble, and sorry I don't have a less-hacky answer for you right now.
[1] https://github.com/tensorflow/tensorflow/blob/e1f44d8/tensorflow/core/platform/cuda_libdevice_path.cc#L23 and https://github.com/tensorflow/tensorflow/blob/1084748efa3234c7daa824718aeb7df7b9252def/tensorflow/core/platform/default/cuda_libdevice_path.cc#L27
Upvotes: 2