Reputation: 7067
I have just started exploring AI and have never used TensorFlow; even Linux is new to me.
I had previously installed NVIDIA driver 430, which comes with CUDA 10.1.
Since tensorflow-gpu 1.14 doesn't support CUDA 10.1, I uninstalled CUDA 10.1 and downloaded CUDA 10.0:
cuda_10.0.130_410.48_linux.run
Once it was installed, I ran:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
When I tried to use the GPU in Jupyter Notebook, the code still didn't work:
import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))
Error:
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1355 try:
-> 1356 return fn(*args)
1357 except errors.OpError as e:
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1338 # Ensure any changes to the graph are reflected in the runtime.
-> 1339 self._extend_graph()
1340 return self._call_tf_sessionrun(
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _extend_graph(self)
1373 with self._graph._session_run_lock(): # pylint: disable=protected-access
-> 1374 tf_session.ExtendSession(self._session)
1375
InvalidArgumentError: Cannot assign a device for operation MatMul: {{node MatMul}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[MatMul]]
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-19-3a5be606bcc9> in <module>
6
7 with tf.Session() as sess:
----> 8 print (sess.run(c))
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
948 try:
949 result = self._run(None, fetches, feed_dict, options_ptr,
--> 950 run_metadata_ptr)
951 if run_metadata:
952 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1171 if final_fetches or final_targets or (handle and feed_dict_tensor):
1172 results = self._do_run(handle, final_targets, final_fetches,
-> 1173 feed_dict_tensor, options, run_metadata)
1174 else:
1175 results = []
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1348 if handle is None:
1349 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1350 run_metadata)
1351 else:
1352 return self._do_call(_prun_fn, handle, feeds, fetches)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1368 pass
1369 message = error_interpolation.interpolate(message, self._graph)
-> 1370 raise type(e)(node_def, op, message)
1371
1372 def _extend_graph(self):
InvalidArgumentError: Cannot assign a device for operation MatMul: node MatMul (defined at <ipython-input-9-b145a02709f7>:5) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[MatMul]]
Errors may have originated from an input operation.
Input Source operations connected to node MatMul:
b (defined at <ipython-input-9-b145a02709f7>:4)
a (defined at <ipython-input-9-b145a02709f7>:3)
However, if I run this code with Python from the terminal, it works and I can see the output:
[[22. 28.] [49. 64.]]
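Since the same code works from the terminal, one thing worth checking is whether the notebook kernel and the shell see the same interpreter and library paths. This is only an assumption about where the difference might lie, not something established above. A minimal sketch, using a hypothetical helper named describe_environment, that can be run once in a notebook cell and once from the terminal so the two outputs can be compared:

```python
import os
import sys


def describe_environment():
    """Collect settings that can differ between a notebook kernel and a shell."""
    return {
        "python": sys.executable,  # which interpreter is running
        "prefix": sys.prefix,      # which environment it belongs to
        # Directories searched for shared libraries such as the CUDA runtime.
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        "PATH": os.environ.get("PATH", ""),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the two runs print different values, that would at least narrow down why one of them finds the GPU and the other does not.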
Upvotes: 3
Views: 7959
Reputation: 1924
You need to make sure you have the appropriate CUDA AND CuDNN versions installed.
You can check your CuDNN version with the advice from this link: How to verify CuDNN installation?
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
on a Linux machine.
You can check your CUDA version as described here: xcat.docs
nvcc -V
nvidia-smi
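The grep above can also be done from Python, which is handy inside a notebook. The following is a minimal sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integer macros; parse_cudnn_version is a hypothetical helper name and the sample text is made up for illustration.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn.h, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Made-up header excerpt, used only to illustrate the expected layout.
sample = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""
print(parse_cudnn_version(sample))  # (7, 6, 0)
```

To check a real installation, pass in the contents of /usr/local/cuda/include/cudnn.h. A result of None would suggest that the header is missing the macros or that CuDNN is not installed at that path.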
There is some information about xla_gpu devices here: tensorflow xla and here: github xla_gpu issue
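The error message in the question lists XLA_GPU:0 but no plain GPU:0, and that is the distinction that matters for tf.device('/gpu:0'). Here is a small sketch of that check, with a hypothetical helper named physical_gpu_names. It works on device name strings, which in TensorFlow 1.x can be obtained from device_lib.list_local_devices() in tensorflow.python.client.

```python
def physical_gpu_names(device_names):
    """Keep only devices whose type is exactly GPU, ignoring CPU, XLA_CPU and XLA_GPU."""
    result = []
    for name in device_names:
        # ".../device:XLA_GPU:0" -> "XLA_GPU:0" -> "XLA_GPU"
        device_type = name.rsplit("device:", 1)[-1].split(":")[0]
        if device_type == "GPU":
            result.append(name)
    return result


# Device names copied from the error message in the question.
reported = [
    "/job:localhost/replica:0/task:0/device:CPU:0",
    "/job:localhost/replica:0/task:0/device:XLA_CPU:0",
    "/job:localhost/replica:0/task:0/device:XLA_GPU:0",
]
print(physical_gpu_names(reported))  # []

# With TensorFlow installed, the names could be collected like this:
#   from tensorflow.python.client import device_lib
#   names = [d.name for d in device_lib.list_local_devices()]
```

An empty list here matches the error: the session can see an XLA_GPU device but nothing it can use as /device:GPU:0.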
It looks like, with CUDA but without CuDNN, TensorFlow calls GPUs xla_gpus. NVIDIA GPUs need both CUDA and CuDNN to work properly with TensorFlow, so it looks like TensorFlow is trying to use its own library to compute on the GPU. But I'm not really sure.
Upvotes: 4