Reputation: 5
I'm trying to run tensorflow-gpu 2.0 on Windows 10 in a conda environment. The code is the basic tutorial from the TensorFlow page:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test, verbose=2)
I don't understand the error below, and I have already uninstalled and reinstalled everything. Could it be that I have not installed keras-gpu yet? I am just getting started with this library, please help :(
Epoch 1/5
2020-01-24 23:40:35.430377: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-01-24 23:40:35.923375: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.933612: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.941088: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.952234: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.961783: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.970378: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-01-24 23:40:35.976378: W tensorflow/stream_executor/stream.cc:1919] attempting to perform BLAS operation using StreamExecutor without BLAS support
2020-01-24 23:40:35.986426: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Internal: Blas GEMM launch failed : a.shape=(32, 784), b.shape=(784, 128), m=32, n=128, k=784
[[{{node sequential/dense/MatMul}}]]
32/60000 [..............................] - ETA: 2:37:06Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 728, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 324, in fit
total_epochs=epochs)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 123, in run_one_epoch
batch_outs = execution_function(iterator)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 86, in execution_function
distributed_function(input_fn))
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 457, in __call__
result = self._call(*args, **kwds)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 520, in _call
return self._stateless_fn(*args, **kwds)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\function.py", line 1823, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\function.py", line 1141, in _filtered_call
self.captured_inputs)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\function.py", line 1224, in _call_flat
ctx, args, cancellation_manager=cancellation_manager)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\function.py", line 511, in call
ctx=ctx)
File "C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(32, 784), b.shape=(784, 128), m=32, n=128, k=784
[[node sequential/dense/MatMul (defined at C:\Users\igorr_z1q8wib\.conda\envs\tf_gpu\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_distributed_function_706]
Function call stack:
distributed_function
>>>
>>> model.evaluate(x_test, y_test, verbose=2)
2020-01-24 23:40:36.878248: I tensorflow/stream_executor/stream.cc:1868] [stream=000002DA3ACFDB20,impl=000002DA3B9C8060] did not wait for [stream=000002DA3ACFD9A0,impl=000002DA3B9C7F70]
2020-01-24 23:40:36.892612: I tensorflow/stream_executor/stream.cc:4816] [stream=000002DA3ACFDB20,impl=000002DA3B9C8060] did not memcpy host-to-device; source: 000002DAA3AF8C80
2020-01-24 23:40:36.901014: F tensorflow/core/common_runtime/gpu/gpu_util.cc:342] CPU->GPU Memcpy failed
Upvotes: 0
Views: 2803
Reputation: 207
Igor, are you setting the GPU device?
https://devblogs.nvidia.com/cuda-pro-tip-always-set-current-device-avoid-multithreading-bugs/
https://www.tensorflow.org/guide/gpu
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
# Check that TensorFlow can see the GPU at all
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
# It is possible to set the device manually.
# Log which device each op runs on
tf.debugging.set_log_device_placement(True)
# Place tensors on the CPU
with tf.device('/CPU:0'):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)
    print(c)
# Run the tutorial model on the CPU as well
with tf.device('/CPU:0'):
    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=5)
    model.evaluate(x_test, y_test, verbose=2)
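If the model trains fine on the CPU, the `CUBLAS_STATUS_ALLOC_FAILED` lines in your log suggest that cuBLAS could not allocate GPU memory. That is my assumption, not something I can confirm for your machine. If it is the cause, enabling memory growth as described in the GPU guide linked above may help. A minimal sketch (the helper name `enable_gpu_memory_growth` is made up for illustration):

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    Returns the number of GPUs TensorFlow found.
    """
    # Imported here so the helper can be defined without loading TensorFlow.
    import tensorflow as tf

    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised if the GPUs were already initialized; memory growth
            # has to be set before any op runs on the device.
            print("Could not set memory growth:", err)
    return len(gpus)
```

Call `enable_gpu_memory_growth()` once at the top of the script, before building the model. Setting the environment variable `TF_FORCE_GPU_ALLOW_GROWTH` to `true` before starting Python should have the same effect. It may also be worth checking that no other Python session or program is still holding the GPU memory.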
Upvotes: 2