ttb
ttb

Reputation: 229

Tensorflow dataloading issue

I'm working with an example program that uses the MNIST dataset.

It tries to load the dataset using this line:

dataset = tfds.load(name='mnist', split=split)

However, this yields the following error:

2020-07-30 12:08:17.926262: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".
Traceback (most recent call last):
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/utils/py_utils.py", line 399, in try_reraise
yield
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/registered.py", line 244, in builder
return builder_cls(name)(**builder_kwargs)
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/api_utils.py", line 69, in disallow_positional_args_dec
return fn(*args, **kwargs)
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/dataset_builder.py", line 206, in __init__
self.info.initialize_from_bucket()
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/dataset_info.py", line 423, in initialize_from_bucket
data_files = gcs_utils.gcs_dataset_info_files(self.full_name)
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/utils/gcs_utils.py", line 71, in gcs_dataset_info_files
return gcs_listdir(posixpath.join(GCS_DATASET_INFO_DIR, dataset_dir))
File "/home/tflynn/pylocal/lib/python3.7/site- 
packages/tensorflow_datasets/core/utils/gcs_utils.py", line 64, in gcs_listdir
if is_gcs_disabled() or not tf.io.gfile.exists(root_dir):
File "/home/tflynn/.local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", 
line 267, in file_exists_v2
_pywrap_file_io.FileExists(compat.as_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error executing an HTTP 
request: libcurl code 77 meaning 'Problem with the SSL CA cert (path? access rights?)', 
error details: error setting certificate verify locations:
CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
when reading metadata of gs://tfds-data/dataset_info/mnist/3.0.1

I've searched on google, but couldn't find any other instances of this error with tensorflow. The node is connected to the internet, if that makes a difference.

Upvotes: 2

Views: 3544

Answers (4)

adam.lofts
adam.lofts

Reputation: 1162

The error is saying CURL couldn't find the file on your computer with root SSL certificates. On my machine this file is stored at /etc/ssl/certs/ca-bundle.crt.

You can override the path CURL looks for by setting the CURL_CA_BUNDLE environment variable. For example, either add this to the top of your python script/notebook:

import os
os.environ['CURL_CA_BUNDLE'] = "/etc/ssl/certs/ca-bundle.crt"

or you could set the environment variable in a shell, e.g.

export CURL_CA_BUNDLE=/etc/ssl/certs/ca-bundle.crt

Upvotes: 2

Jason Stock
Jason Stock

Reputation: 13

If you are unable to create a symbolic link or install using apt, you can try to upgrade to a more recent version of tfds. This issue is not present in the nightly build version 3.2.1.

pip install tfds-nightly: Released every day, contains the last versions of the datasets.

According to the TensorFlow/datasets GitHub repository, one commenter suggests to downgrade to 3.0.0, however; I have not tried this to see if it works.

Upvotes: 1

ørk
ørk

Reputation: 61

I've had the same issue on an Fedora32 system. The directory /etc/ssl/certs/ exists and there is a file ca-bundle.crt. The following command solved the problem:

sudo ln -s /etc/ssl/certs/ca-bundle.crt /etc/ssl/certs/ca-certificates.crt

Upvotes: 6

Lorenz Kummer
Lorenz Kummer

Reputation: 75

Probably same as [1]. Run

apt-get update apt-get install -y ca-certificates

if on linux before executing your code or commands of similar effect on your OS.

[1] https://github.com/tensorflow/serving/issues/1022

Upvotes: 1

Related Questions