Reputation: 2624
I'm trying to run tensorflow-gpu on Windows 10 on a Laptop with a Quadro GPU
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.21       Driver Version: 465.21       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro T2000       WDDM  | 00000000:01:00.0  On |                  N/A |
| N/A   59C    P0    14W /  N/A |   2708MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
When trying to verify that everything works, I found that device_lib.list_local_devices() fails with
**RuntimeError: cudaGetDevice() failed. Status: invalid argument**
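The failing call itself is nothing special; from a plain Python shell I ran essentially this (the same call that appears in the traceback below):

from tensorflow.python.client import device_lib

# Listing the local devices is what raises the RuntimeError
device_lib.list_local_devices()

The full log and traceback: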
2021-01-13 11:30:14.735823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: Quadro T2000 computeCapability: 7.5
coreClock: 1.5GHz coreCount: 16 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.34GiB/s
2021-01-13 11:30:14.736173: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-01-13 11:30:14.736376: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-01-13 11:30:14.736590: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-01-13 11:30:14.736801: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-01-13 11:30:14.737016: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-01-13 11:30:14.737221: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-01-13 11:30:14.737418: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-01-13 11:30:14.737590: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-01-13 11:30:14.737787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\D041705\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\client\device_lib.py", line
43, in list_local_devices
_convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: cudaGetDevice() failed. Status: invalid argument
Any hints why that happens? I have CUDA 11.2 and Python 3.8.7, and I installed the latest tensorflow and tensorflow-gpu packages.
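For reference, a quick way to double-check the interpreter and TensorFlow versions (nothing beyond the standard version attributes, shown here only for completeness):

import sys
import tensorflow as tf

# On my setup this should report Python 3.8.7 and the latest tensorflow release
print(sys.version)
print(tf.__version__)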
Upvotes: 3
Views: 1015
Reputation: 824
Just solved this problem. I suppose the universal solution is to downgrade your CUDA and GPU driver versions.
First, according to the latest issue reports, TensorFlow 2.4 is not compatible with CUDA 11.2 or 11.1; use CUDA 11.0 instead.
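To see which CUDA/cuDNN versions your installed TensorFlow wheel was actually built against, something like this should work (tf.sysconfig.get_build_info() is available in recent TF 2.x releases; treat the exact dictionary keys as an assumption on older builds):

import tensorflow as tf

# The build info dict reports the CUDA/cuDNN versions the wheel was compiled with
info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"))   # e.g. 11.0 for a stock TF 2.4 wheel
print(info.get("cudnn_version"))  # e.g. 8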
Second, if you are using the latest GPU driver, you will find that nvidia-smi reports CUDA Version 11.3. Downgrade the GPU driver to an older version; 461.09 worked in my case.
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 461.09       Driver Version: 461.09       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
and
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:35_Pacific_Daylight_Time_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.relgpu_drvr445TC445_37.28845127_0
To test whether CUDA is actually working, run a real TensorFlow computation; just listing devices may not be enough.
import tensorflow as tf

# Log the device each op is placed on, so GPU usage is visible
tf.debugging.set_log_device_placement(True)

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)
output:
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Upvotes: 2