Reputation: 2267
I am loosely following a tutorial to train a TensorFlow estimator on Google Cloud AI Platform.
I would like to access a directory that contains my training and evaluation data, and to this end I have copied my data files recursively to Google Storage like this:
gsutil cp -r data gs://name-of-my-bucket/data
This works fine, and gsutil ls gs://name-of-my-bucket/data
correctly returns:
gs://name-of-my-bucket/data/test.json
gs://name-of-my-bucket/data/test
gs://name-of-my-bucket/data/train
However, calling os.listdir(data_dir) from a Python script raises a FileNotFoundError for any value of data_dir that I've tried so far, including 'data/' and 'name-of-my-bucket/data/'. Why?
I know that my Python script is being executed from the directory /root/.local/lib/python3.7/site-packages/trainer/, with /user_dir as the working directory.
Here is the code that precedes the line where the error arises, directly from the __main__
section of my Python script:
PARSER = argparse.ArgumentParser()
PARSER.add_argument('--job-dir', ...)
PARSER.add_argument('--eval-steps', ...)
PARSER.add_argument('--export-format', ...)
ARGS = PARSER.parse_args()
tf.logging.set_verbosity('INFO')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = str(tf.logging.__dict__['INFO'] / 10)
HPARAMS = hparam.HParams(**ARGS.__dict__)
Here is the line of code where the error arises (first line of a separate function that gets invoked right after the lines of code I have reported above):
mug_dirs = [f for f in os.listdir(image_dir) if not f.startswith('.')]
My logs for this job are mostly INFO entries (plus 5 TensorFlow-related deprecation warnings), followed by an error from the master-replica-0 task:
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/.local/lib/python3.7/site-packages/trainer/final_task.py", line 114, in <module>
    train_model(HPARAMS)
  File "/root/.local/lib/python3.7/site-packages/trainer/final_task.py", line 55, in train_model
    (train_data, train_labels) = data.create_data_with_labels("data/train/")
  File "/root/.local/lib/python3.7/site-packages/trainer/data.py", line 13, in create_data_with_labels
    mug_dirs = [f for f in os.listdir(image_dir) if not f.startswith('.')]
FileNotFoundError: [Errno 2] No such file or directory: 'data/train/'
... followed by another error from the same task (reporting a non-zero exit status from my Python command), then two INFO entries about clean-up, and finally an error from the service task:
The replica master 0 exited with a non-zero status of 1.
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/.local/lib/python3.7/site-packages/trainer/final_task.py", line 114, in <module>
    train_model(HPARAMS)
  File "/root/.local/lib/python3.7/site-packages/trainer/final_task.py", line 55, in train_model
    (train_data, train_labels) = data.create_data_with_labels("data/train/")
  File "/root/.local/lib/python3.7/site-packages/trainer/data.py", line 13, in create_data_with_labels
    mug_dirs = [f for f in os.listdir(image_dir) if not f.startswith('.')]
FileNotFoundError: [Errno 2] No such file or directory: 'data/train/'
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=1047296516162&resource=ml_job%2Fjob_id%2Fml6_run_25&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22ml6_run_25%22
Upvotes: 0
Views: 2454
Reputation: 61
You can use the TensorFlow API to list files in a Cloud Storage bucket. See the documentation for tf.io.gfile.glob: https://www.tensorflow.org/api_docs/python/tf/io/gfile/glob
For example, to get all JSON files under your data prefix, you can use this:
import tensorflow as tf
json_files = tf.io.gfile.glob("gs://name-of-my-bucket/data/*.json")
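If you need a drop-in replacement for os.listdir, tf.io.gfile.listdir also accepts gs:// paths. A minimal sketch, assuming the bucket and data layout from the question:
import tensorflow as tf

# List entries under the "train" prefix, skipping hidden files,
# mirroring the os.listdir call that failed in the question.
image_dir = "gs://name-of-my-bucket/data/train/"
mug_dirs = [f for f in tf.io.gfile.listdir(image_dir) if not f.startswith('.')]
print(mug_dirs)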
Upvotes: 1
Reputation: 2368
Cloud Storage objects live in a flat namespace and are not contained in folders. To provide a more user-friendly experience, gsutil and the Google Cloud Storage UI create the illusion of a hierarchical file tree. More information can be found in the documentation.
Now, if you are trying to read a file object that is hosted on Cloud Storage, you can download the object to your local directory using the Cloud Storage client libraries (see the client library documentation). Alternatively, you can use the gsutil cp command, which lets you copy data between your local directory and Cloud Storage buckets, among other options.
Once you have downloaded a copy of the object from the GCS bucket to your local directory, you will be able to manipulate that file as needed.
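As a rough sketch of the client-library approach (assuming the bucket and object names from the question, and that the google-cloud-storage package is installed), downloading an object to a local path might look like this:
from google.cloud import storage

# Download a single Cloud Storage object to the local filesystem.
client = storage.Client()
bucket = client.bucket("name-of-my-bucket")
blob = bucket.blob("data/test.json")  # the full object name, including the "data/" prefix
blob.download_to_filename("/tmp/test.json")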
Please note that it is not possible to use os.listdir to access a GCS bucket object. Because Cloud Storage is a flat namespace, a bucket containing gs://my-bucket/data/test.json actually holds an object named data/test.json at the root of gs://my-bucket. Note that the object name includes / characters. Therefore, if you would like to access, for instance, your file test.json in your bucket, you can check the documentation above and use data/test.json as the reference - the concept of a folder does not exist per se. Similarly, if you needed to access your train file object, you would use data/train as the reference.
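If you want something that behaves like listing a directory, one option (a hedged sketch using the Cloud Storage Python client; not part of the original answer) is to list objects by prefix:
from google.cloud import storage

# Emulate "listing a folder" by iterating over all objects whose names share a prefix.
client = storage.Client()
for blob in client.list_blobs("name-of-my-bucket", prefix="data/train/"):
    print(blob.name)  # e.g. "data/train/<object name>"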
Upvotes: 0