opus111

Reputation: 2854

google ml-engine cloud storage as a file

I am working in Python with Google Cloud ML-Engine. The documentation I have found indicates that data should be stored in Cloud Storage, using buckets and blobs:

https://cloud.google.com/ml-engine/docs/tensorflow/working-with-cloud-storage

However, much of my code, and the libraries it calls, works with files. Can I somehow treat Google Cloud Storage as a file system in my ml-engine code?

I want my code to read like

with open(<something>) as f:
   for line in f:
      dosomething(line)

Note that in ml-engine one does not create and configure VM instances. So I cannot mount my own shared filesystem with Filestore.

Upvotes: 1

Views: 271

Answers (2)

opus111

Reputation: 2854

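For those who come after, here is the answer (see also the question "Google Cloud ML and GCS Bucket issues"): use the file_io module that ships with TensorFlow, which accepts Cloud Storage paths.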

from tensorflow.python.lib.io import file_io

Here is an example

# Cloud Storage paths use the gs:// scheme.
with file_io.FileIO("gs://bucket_name/foobar.txt", "w") as f:
    f.write("FOO")
    f.flush()
    print("Write foobar.txt")

with file_io.FileIO("gs://bucket_name/foobar.txt", "r") as f:
    for line in f:
        print("Read foobar.txt: " + line)
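
To keep the `with open(...)` shape from the question, one option is a small wrapper that dispatches on the path prefix. This is only a sketch and not part of the original answer: `open_path` and `count_lines` are hypothetical helper names, and the code assumes TensorFlow is installed whenever a gs:// path is passed.

```python
def open_path(path, mode="r"):
    """Open a local path or a gs:// path and return a file-like object.

    `open_path` is a hypothetical helper, not a TensorFlow API.
    """
    if path.startswith("gs://"):
        # Imported lazily so that local-only runs do not need TensorFlow.
        from tensorflow.python.lib.io import file_io
        return file_io.FileIO(path, mode)
    return open(path, mode)


def count_lines(path):
    """Example consumer: iterate over lines just as with built-in open()."""
    with open_path(path) as f:
        return sum(1 for _ in f)
```

Code written against `open_path` can then be exercised locally with ordinary files and pointed at a bucket path when it runs as a job.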

Upvotes: 2

Dan Cornilescu

Reputation: 39814

The only way to have Cloud Storage appear as a filesystem is to mount a bucket as a file system:

"You can use the Google Cloud Storage FUSE tool to mount a Cloud Storage bucket to your Compute Engine instance. The mounted bucket behaves similarly to a persistent disk even though Cloud Storage buckets are object storage."

But you cannot do that if you can't create and configure VMs.

"Note that in ml-engine one does not create and configure VM instances."

That's not entirely true. ML Engine supports building custom containers, which is typically how one installs and configures OS-level dependencies. Custom containers are available only for training, though, so this is worth a try only if your needs fall in that area.

I assume you have already checked that the library doesn't support access through an already open file-like handle. If you have not, this question may be of interest: "How to restore Tensorflow model from Google bucket without writing to filesystem?"
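
As a rough illustration of what access through an already open file-like handle looks like: if a library function takes a handle rather than a path, any object with the file interface can be passed in. In the sketch below, `summarize` is a hypothetical stand-in for such a library function, and `io.StringIO` stands in for the stream. On ML Engine the handle could instead be a `file_io.FileIO` object opened on a Cloud Storage path.

```python
import io


def summarize(handle):
    """Hypothetical stand-in for a library call that accepts an open handle."""
    lines = [line.rstrip("\n") for line in handle]
    return {"lines": len(lines), "chars": sum(len(text) for text in lines)}


# An in-memory stream plays the role of the remote object here.
stream = io.StringIO("first\nsecond\n")
print(summarize(stream))
```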

Upvotes: 2
