Reputation:
Here's the code:
# import libraries
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
# import dataset
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
training_set = train_datagen.flow_from_directory(
'data/spectrogramme/ensemble_de_formation',
target_size = (64, 64),
batch_size = 128,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('data/spectrogramme/ensemble_de_test',
target_size = (64, 64),
batch_size = 128,
class_mode = 'binary')
# initializing
reseau = Sequential()
# 1. convolution
reseau.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
reseau.add(MaxPooling2D(pool_size = (2, 2)))
reseau.add(Conv2D(32, (3, 3), activation = 'relu'))
reseau.add(MaxPooling2D(pool_size = (2, 2)))
reseau.add(Conv2D(64, (3, 3), activation = 'relu'))
reseau.add(MaxPooling2D(pool_size = (2, 2)))
reseau.add(Conv2D(64, (3, 3), activation = 'relu'))
reseau.add(MaxPooling2D(pool_size = (2, 2)))
# 2. flattening
reseau.add(Flatten())
# 3. fully connected
from keras.layers import Dropout
reseau.add(Dense(units = 64, activation = 'relu'))
reseau.add(Dropout(0.1))
reseau.add(Dense(units = 128, activation = 'relu'))
reseau.add(Dropout(0.05))
reseau.add(Dense(units = 256, activation = 'relu'))
reseau.add(Dropout(0.03))
reseau.add(Dense(units = 1, activation = 'sigmoid'))
# 4. compile
reseau.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# 5. fit
reseau.fit_generator(training_set, steps_per_epoch = 8000, epochs = 1,
validation_data = test_set, validation_steps = 2000)
This should prove that I have tensorflow-gpu with CUDA and cuDNN installed (screenshot).
I don't know what to do. I have reinstalled CUDA and cuDNN multiple times.
HOWEVER, if I uninstall tensorflow-gpu, the program runs flawlessly... with the exception of needing 5000 seconds per epoch... I'd like to avoid that.
FYI, this is all happening on Windows.
Any help is appreciated.
Upvotes: 13
Views: 45376
Reputation: 36
I had a similar problem while trying to run the simplest neural network with the Keras library.
model = Sequential()
model.add(Input(shape=(vocab_size, )))
model.add(Dense(embed_size, activation="linear"))
model.add(Dense(vocab_size, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")
I was running on an Apple M2, but the kernel kept dying and a restart notification would pop up in Jupyter.
It used to fail every time I ran:
model.fit(X, y, epochs=1000)
Upgrading TensorFlow fixed it:
pip install --upgrade tensorflow
This worked because it updated the existing TensorFlow, Keras, and other dependent libraries. Hope it helps someone!
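To see what the upgrade actually changed, it can help to print the installed versions before and after. A minimal sketch using only the standard library; the package list and the helper name `installed_versions` are my assumptions, so adjust them to your environment:

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "keras", "numpy")):
    """Return {package: version or None} for each distribution name."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it once before and once after the pip command to compare the two outputs.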
Upvotes: 0
Reputation: 403
Please check your cuDNN installation.
I had the same problem, and it was solved after installing the correct cuDNN version.
Upvotes: 0
Reputation: 31
import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'
This solution was provided by Krishna Kankipati on Kaggle.
Upvotes: 1
Reputation: 41
The CUDA, cuDNN, TensorFlow, and Python version compatibility table can be found at https://www.tensorflow.org/install/source#gpu but I used the following versions and it works perfectly.
The problem can be solved by:
This is working for me. Earlier I had not placed zlibwapi.dll in the CUDA/bin folder, which is why I faced the same problem.
I hope this helps.
Upvotes: 0
Reputation: 101
I had a similar problem because my CUDA and cuDNN versions were much higher than what is listed in the compatibility chart. The Dense layers would work fine for me, but using Conv2D/Conv3D would kill my kernel.
Solution
Make sure you have the zlib file copied into your CUDA\v11.x\bin directory. I had issues downloading it from NVIDIA's website but found a way around that.
NVIDIA's website refers to zlibwapi.dll. I was able to locate this file in “C:\Program Files\Microsoft Office\root\Office16\ODBC Drivers\Salesforce\lib” (I installed Microsoft 365 x64 on Windows 11) and copied it into “C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin”. I was able to run TensorFlow 2.8.0 thereafter.
Thanks to srikkanth_kn's solution, I was able to find the zlibwapi.dll file (in MS Office) and paste it into CUDA's bin folder (make sure CUDA's bin folder is in your PATH). After that everything worked fine. Hope this helps you and saves you time.
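To double-check that the DLL is actually reachable, you can search the directories on PATH for it. A minimal sketch with the standard library only; `find_on_path` is a helper name I made up for illustration:

```python
import os
from pathlib import Path


def find_on_path(filename, path=None):
    """Return every directory on PATH that contains `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for entry in path.split(os.pathsep):
        # Skip empty entries such as the one produced by a trailing separator.
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = find_on_path("zlibwapi.dll")
    print(found if found else "zlibwapi.dll is not on PATH")
```

If the list is empty, either the file was not copied or CUDA's bin folder is missing from PATH.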
Upvotes: 3
Reputation: 1
I had exactly the same problem. I tried every solution mentioned in this post and none of them worked. After many attempts, I found that the problem was the CUDA installation. I had followed the NVIDIA tutorial, but at the step where you copy the 3 files from the cuDNN directory (as in the tutorial), you should instead copy the 3 folders and paste them into the NVIDIA directory, replacing what is there. After this, my GPU works without problems.
Upvotes: 0
Reputation: 1
I had the same issue. In the end, running the file as a .py script showed that the problem was with cuDNN: not all of its files were installed.
Upvotes: 0
Reputation: 2655
A very cumbersome issue with tensorflow-gpu. It took me days to find the best working solution.
What seems to be the problem:
I know you might have installed cuDNN and CUDA (just like me) after watching YouTube videos or reading documentation online. But CUDA and cuDNN are very strict about version clashes, so there may be a version mismatch between your TensorFlow, CUDA, and cuDNN versions.
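One way to rule out a mismatch is to compare your versions against the tested-build table at https://www.tensorflow.org/install/source#gpu. A rough sketch, assuming you fill in the table yourself from that page; the two rows below are only illustrative and should be verified there, and `check_stack` is a made-up helper name:

```python
# Illustrative rows only; verify and extend from the official tested-build table.
TESTED_BUILDS = {
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
    "2.8": {"cuda": "11.2", "cudnn": "8.1"},
}


def major_minor(version):
    """Reduce '11.2.1' to '11.2' so patch levels are ignored."""
    return ".".join(version.split(".")[:2])


def check_stack(tf_version, cuda_version, cudnn_version, table=TESTED_BUILDS):
    """Return a list of mismatch messages (empty if the combination is listed)."""
    key = major_minor(tf_version)
    expected = table.get(key)
    if expected is None:
        return [f"TensorFlow {key} is not in the table"]
    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append(f"CUDA {cuda_version}: table lists {expected['cuda']}")
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append(f"cuDNN {cudnn_version}: table lists {expected['cudnn']}")
    return problems


if __name__ == "__main__":
    for message in check_stack("2.3.0", "11.0", "8.0"):
        print(message)
```

A combination outside the table does not always fail, but it is the first thing worth ruling out.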
What's the solution:
The tensorflow build automatically selected by Anaconda on Windows 10 during the installation of tensorflow-gpu 2.3 seems to be faulty. Please find a workaround here (consider upvoting the GitHub answer if you have a GitHub account).
Python 3.7: conda install tensorflow-gpu=2.3 tensorflow=2.3=mkl_py37h936c3e2_0
Python 3.8: conda install tensorflow-gpu=2.3 tensorflow=2.3=mkl_py38h1fcfbd6_0
These commands automatically download the CUDA and cuDNN libraries along with tensorflow-gpu. After trying this solution, I was able to fit() the TensorFlow models and got the speed-up from the installed GPU.
A word of advice:
If you are working in machine learning / data science, I would strongly advise you to switch from pip to Anaconda. This lets you create virtual environments and integrates easily with Jupyter notebooks. You can create a separate virtual environment for machine learning tasks, as they often require upgrading or downgrading libraries. With virtual environments, this won't affect your other packages outside the environment.
Upvotes: 4
Reputation: 1214
I had the same problem. In my case, the notebook kernel was crashing as soon as I ran the block with all the model.add() code.
I went to Jupyter Home and found that another notebook, which I had used earlier to train a model on the GPU, was still running, even though I had closed its browser tab. As suggested by @Ian Henry, I shut down the ones I wasn't using, restarted the kernel, and ran all the blocks again. This time it worked perfectly fine.
Note that notebooks keep running in the background even when you close the browser. You can verify this by checking the icon for the respective notebook, which should be green if it is running and grey if not. To shut down a running notebook, go to the Running tab and click the Shutdown button next to the notebook name.
Upvotes: 1
Reputation: 1463
I had the same issue running model.fit() in Jupyter Notebook. A good starting point for debugging is always to download the notebook as a .py file and run it. This way you get all errors and warnings.
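Besides the notebook's download menu, `jupyter nbconvert --to script` does this conversion from the command line. If you want to see what is involved, here is a minimal standard-library sketch of the same idea, assuming the usual nbformat 4 layout (a top-level "cells" list whose entries have "cell_type" and "source"); `notebook_to_script` is a hypothetical helper, not part of Jupyter:

```python
import json


def notebook_to_script(notebook_json):
    """Concatenate the code cells of an nbformat-4 notebook into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        lines = []
        for line in source.splitlines():
            # IPython magics and shell escapes are not valid plain Python.
            if line.lstrip().startswith(("%", "!")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# Example use (file names are placeholders):
#   with open("notebook.ipynb", encoding="utf-8") as f:
#       script = notebook_to_script(f.read())
#   with open("notebook.py", "w", encoding="utf-8") as f:
#       f.write(script)
```

Running the resulting file with python from a terminal shows the CUDA/cuDNN errors that the notebook hides when the kernel simply dies.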
In terms of a solution: I doubt that this will solve most cases, but I installed cuDNN 7.2(.1) via .deb files, reinstalled tensorflow-gpu, and it worked. In the end, it wasn't a version issue with the driver (I had CUDA 9.0 and 384.xx, which was correct), but one with cuDNN.
Upvotes: 0
Reputation: 3508
The problem is with Jupyter Notebook; I have the same problem there. If you run the same code in a CPU-based environment, or in a terminal with the GPU, it will work.
Upvotes: 0
Reputation: 21
If you are using Jupyter, check for any running notebooks; I've found that they hang on to GPU memory even when they are not actively running.
In Jupyter, shut down any unused ones that are still running.
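To see which processes are holding the GPU, you can also ask nvidia-smi directly. A small sketch, assuming nvidia-smi is on PATH (it returns an empty list otherwise); the helper names are mine, and on Windows the memory column may show [N/A] rather than a number:

```python
import shutil
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows into dicts."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # ignore blank or unexpected lines
        pid, name, memory = parts
        rows.append({"pid": int(pid), "name": name, "memory": memory})
    return rows


def gpu_processes():
    """List compute processes reported by nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    for proc in gpu_processes():
        print(proc["pid"], proc["name"], proc["memory"])
```

Any python process listed there that belongs to a notebook you are no longer using is a candidate to shut down from the Running tab.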
Upvotes: 0