Reputation: 7077
I just started exploring AI and have never used Tensorflow; even Linux is new to me.
I had previously installed NVIDIA Driver 430, which comes with CUDA 10.1.
Since Tensorflow-gpu 1.14 doesn't support CUDA 10.1, I uninstalled CUDA 10.1 and downloaded CUDA 10.0.
Once it was installed, I ran:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
When I tried to use the GPU in Jupyter Notebook, the code still didn't work:
import tensorflow as tf
with tf.device('/gpu:0'):  # pin the ops below to the first GPU
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print(sess.run(c))
InvalidArgumentError Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _do_call(self, fn, *args)
1355 try:
-> 1356 return fn(*args)
1357 except errors.OpError as e:
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1338 # Ensure any changes to the graph are reflected in the runtime.
-> 1339 self._extend_graph()
1340 return self._call_tf_sessionrun(
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _extend_graph(self)
1373 with self._graph._session_run_lock(): # pylint: disable=protected-access
-> 1374 tf_session.ExtendSession(self._session)
InvalidArgumentError: Cannot assign a device for operation MatMul: {{node MatMul}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-19-3a5be606bcc9> in <module>
7 with tf.Session() as sess:
----> 8 print (
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in run(self, fetches, feed_dict, options, run_metadata)
948 try:
949 result = self._run(None, fetches, feed_dict, options_ptr,
--> 950 run_metadata_ptr)
951 if run_metadata:
952 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _run(self, handle, fetches, feed_dict, options, run_metadata)
1171 if final_fetches or final_targets or (handle and feed_dict_tensor):
1172 results = self._do_run(handle, final_targets, final_fetches,
-> 1173 feed_dict_tensor, options, run_metadata)
1174 else:
1175 results = []
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1348 if handle is None:
1349 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1350 run_metadata)
1351 else:
1352 return self._do_call(_prun_fn, handle, feeds, fetches)
~/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/ in _do_call(self, fn, *args)
1368 pass
1369 message = error_interpolation.interpolate(message, self._graph)
-> 1370 raise type(e)(node_def, op, message)
1372 def _extend_graph(self):
InvalidArgumentError: Cannot assign a device for operation MatMul: node MatMul (defined at <ipython-input-9-b145a02709f7>:5) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
Errors may have originated from an input operation.
Input Source operations connected to node MatMul:
b (defined at <ipython-input-9-b145a02709f7>:4)
a (defined at <ipython-input-9-b145a02709f7>:3)
However, if I run this code from Python in the terminal, it works, and I can see the output:
[[22. 28.] [49. 64.]]
Upvotes: 3
Views: 7962
Reputation: 1924
You need to make sure you have the appropriate CUDA and CuDNN versions installed.
You can check your CuDNN version with the advice from this link: How to verify CuDNN installation?
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
on a Linux machine. You can check your CUDA version here:
nvcc -V
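The two checks above can also be scripted. This is only a minimal sketch, assuming cudnn.h declares CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain `#define` lines and that nvcc prints a `release X.Y` line like the one in the question. The function names are hypothetical, the header path is the one used in the command above, and newer CuDNN releases may keep these defines in a different header.

```python
import re
import subprocess


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a define is missing."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + key + r"\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release string (for example '10.0') from nvcc's version output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed header location, taken from the cat command above.
    try:
        with open("/usr/local/cuda/include/cudnn.h") as header:
            print("CuDNN:", parse_cudnn_version(header.read()))
    except OSError:
        print("CuDNN: cudnn.h not found at the assumed path")
    try:
        result = subprocess.run(["nvcc", "--version"],
                                stdout=subprocess.PIPE, universal_newlines=True)
        print("CUDA:", parse_nvcc_release(result.stdout))
    except OSError:
        print("CUDA: nvcc not found on PATH")
```

The resulting pair of versions can then be compared against the combination your Tensorflow release was built for.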
You can read about xla_gpus here: tensorflow xla and here: github xla_gpu issue
It looks like Tensorflow without CuDNN reports GPUs as xla_gpus. Nvidia GPUs need CUDA and CuDNN to work properly with Tensorflow, so it looks like Tensorflow is trying to use its own library to compute on the GPU. But I'm not really sure.
Upvotes: 4
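The device list in the error message can also be inspected directly from Python. The sketch below assumes the TF 1.x helper `device_lib.list_local_devices()` from `tensorflow.python.client`, which returns entries with `name` and `device_type` fields; the filter function name is hypothetical. It keeps only entries whose type is exactly `GPU`, so an `XLA_GPU` entry like the one in the question's error does not count.

```python
def physical_gpu_names(devices):
    """Return names of devices whose type is exactly 'GPU' (XLA_GPU entries are skipped)."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        # Assumed TF 1.x import path; run this in the same kernel that fails.
        from tensorflow.python.client import device_lib
        print(physical_gpu_names(device_lib.list_local_devices()))
    except ImportError:
        print("TensorFlow is not installed in this environment")
```

An empty list here, in the environment where the notebook runs, would be consistent with the "Cannot assign a device" error above.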