Reputation: 21
I have been writing TensorFlow programs on my computer, which runs Linux Mint. For some reason I can't get TensorFlow to use my GPU. I get this warning:
2021-04-26 15:46:11.462612: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 15:46:11.462650: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I know for a fact that I have CUDA installed, because for PyTorch, the GPU works fine:
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU.
mydevice = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(mydevice)
yields
cuda
Also, when I run a program with TensorFlow, I get:
START TIME: Mon Apr 26 16:34:24 2021
2021-04-26 16:34:24.499178: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-26 16:34:24.499862: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-26 16:34:24.526372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-26 16:34:24.526781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2021-04-26 16:34:24.526900: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.526986: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.527069: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.528676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-26 16:34:24.528994: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-26 16:34:24.530990: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-26 16:34:24.531125: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531230: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531245: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-04-26 16:34:24.531641: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-26 16:34:24.532140: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-26 16:34:24.532178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-26 16:34:24.532192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]
2021-04-26 16:34:24.592917: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-04-26 16:34:24.593369: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2400000000 Hz
I installed TensorFlow in Anaconda using conda, though I believe the build is from PyPI. Please let me know your suggestions. Thank you.
Upvotes: 2
Views: 4122
Reputation: 75
It seems from your logs that TensorFlow IS picking up your GPU (GTX 1650). The problem is that your cudatoolkit and cuDNN versions could be incompatible with your TensorFlow version; TF is rather specific about these requirements. The lines you need to take note of are these:
2021-04-26 16:34:24.526900: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.526986: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.527069: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.528676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-26 16:34:24.531125: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531230: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
The latest TensorFlow release, tensorflow-2.4.0 (see full table), only plays nicely with cuDNN 8.0 and CUDA 11.0, even though newer versions of both have already been released. You might need to check your versions; I think you might be using CUDA 10.
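One way to see which of those libraries are actually reachable is to try opening them the same way the dynamic loader would. The following is only a minimal sketch: the list of library names is copied from the warnings in the log, and the helper names `can_load` and `probe` are made up for illustration. It should be run from inside the same conda environment (and with the same `LD_LIBRARY_PATH`) as the TensorFlow program, otherwise the result may not match what TensorFlow sees.

```python
import ctypes

# Library names copied from the dso_loader warnings in the log.
REQUIRED_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def can_load(soname):
    """Return True if the dynamic linker can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library cannot be found or opened.
        return False


def probe(sonames=REQUIRED_LIBS):
    """Map each library name to whether it could be opened in this process."""
    return {name: can_load(name) for name in sonames}


if __name__ == "__main__":
    for name, ok in probe().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any name reported as missing here is a library the environment still needs, either by installing the matching cudatoolkit/cuDNN or by pointing the loader at an existing installation.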
I would suggest having a look at this post (older but the commands and principles still apply).
conda env create -f environment.yml
conda activate tensorflow_env_388
N.B. A fresh environment will avoid any conflicting packages. Then check which cuDNN and CUDA toolkit versions are installed in it:
conda list cudnn
# packages in environment at /rds/general/user/home/anaconda3/envs/tensorflow_env_388:
#
# Name Version Build Channel
cudnn 7.0.5.39 ha5ca753_1 conda-forge
conda list cudatoolkit
Then install the CUDA toolkit and cuDNN versions as necessary:
conda install cudatoolkit=11.0
conda install cudnn=8.0
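After installing, it may help to confirm from Python which CUDA and cuDNN versions the TensorFlow binary was built against and whether it now registers the GPU. This is a rough sketch rather than an official check: the function name `tf_gpu_report` is made up, and it assumes TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available. The version keys are read with `.get()` because CPU-only builds may not include them.

```python
def tf_gpu_report():
    """Return TensorFlow's CUDA build info and visible GPUs, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    build = tf.sysconfig.get_build_info()  # assumes TF 2.3 or later
    return {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # CUDA/cuDNN versions this binary expects; may be None on CPU-only builds.
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        # Empty list means TensorFlow did not register any GPU.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(tf_gpu_report())
```

If `gpus` is still empty while `built_with_cuda` is true, the reported `cuda_version` and `cudnn_version` are the ones to match with `conda install`.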
Upvotes: 1
Reputation: 36608
What channel did you install it from? If you are using the default channel, you have to specify the GPU build of TensorFlow. Quote the spec so the shell does not try to expand the asterisks:
conda install "tensorflow=2.4.*=gpu*" -c anaconda
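To check which build ended up installed, the build column of `conda list tensorflow` can be inspected. The sketch below parses one row of that output; the helper names are made up, the `tensorflow` row in the usage example is a hypothetical illustration rather than real output, and the rule that GPU variants have a build string starting with `gpu` is an assumption based on the `=gpu*` spec in the command.

```python
def parse_conda_list_line(line):
    """Split one package row of `conda list` output into its columns."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError(f"not a package row: {line!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2],
        # The channel column can be empty for some packages.
        "channel": parts[3] if len(parts) > 3 else None,
    }


def looks_like_gpu_build(row):
    """Assumption: GPU variants carry a build string that starts with 'gpu'."""
    return row["build"].startswith("gpu")


if __name__ == "__main__":
    # First row is taken from the `conda list cudnn` output quoted in this thread;
    # the second row is a hypothetical example of a GPU build string.
    for line in [
        "cudnn 7.0.5.39 ha5ca753_1 conda-forge",
        "tensorflow 2.4.1 gpu_py38_0",
    ]:
        row = parse_conda_list_line(line)
        print(row["name"], row["version"], "GPU build" if looks_like_gpu_build(row) else "not a GPU build")
```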
Upvotes: 0