Reputation: 2854
I am working in Python with Google Cloud ML-Engine. The documentation I have found indicates that data storage should be done with Buckets and Blobs:
https://cloud.google.com/ml-engine/docs/tensorflow/working-with-cloud-storage
However, much of my code, and the libraries it calls, work with files. Can I somehow treat Google Cloud Storage as a file system in my ml-engine code?
I want my code to read like:

    with open(<something>) as f:
        for line in f:
            dosomething(line)

Note that in ml-engine one does not create and configure VM instances. So I cannot mount my own shared filesystem with Filestore.
Upvotes: 1
Views: 271
Reputation: 2854
For those who come after, here is the answer, found in "Google Cloud ML and GCS Bucket issues". TensorFlow's file_io module can read and write Cloud Storage paths directly:
    from tensorflow.python.lib.io import file_io

Here is an example:

    # Cloud Storage URLs use the gs:// scheme.
    with file_io.FileIO("gs://bucket_name/foobar.txt", "w") as f:
        f.write("FOO")
        f.flush()
    print("Write foobar.txt")

    with file_io.FileIO("gs://bucket_name/foobar.txt", "r") as f:
        for line in f:
            print("Read foobar.txt: " + line)

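To get closer to the `with open(...)` style asked for in the question, one option is a small wrapper that dispatches on the path prefix. This is only a sketch: `open_any` is a hypothetical helper name, not part of TensorFlow, and it assumes TensorFlow is installed whenever a `gs://` path is passed. The demonstration below uses a local temporary file so it runs anywhere.

```python
import os
import tempfile


def open_any(path, mode="r"):
    """Open a local path with open() and a gs:// path with file_io.FileIO.

    open_any is a hypothetical helper, not a TensorFlow API.
    """
    if path.startswith("gs://"):
        # Imported lazily so local-only runs do not need TensorFlow.
        from tensorflow.python.lib.io import file_io
        return file_io.FileIO(path, mode)
    return open(path, mode)


# Local demonstration; on ML Engine the path could be a gs:// URL instead.
demo_path = os.path.join(tempfile.mkdtemp(), "foobar.txt")
with open_any(demo_path, "w") as f:
    f.write("FOO\nBAR\n")

with open_any(demo_path) as f:
    lines = [line.rstrip("\n") for line in f]
print(lines)
```

With a helper like this, existing code only needs its `open` calls swapped for `open_any`, and the same script works against local files during development.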
Upvotes: 2
Reputation: 39814
The only way to have Cloud Storage appear as a filesystem is to mount a bucket as a file system:
"You can use the Google Cloud Storage FUSE tool to mount a Cloud Storage bucket to your Compute Engine instance. The mounted bucket behaves similarly to a persistent disk even though Cloud Storage buckets are object storage."
But you cannot do that if you can't create and configure VMs.
"Note that in ml-engine one does not create and configure VM instances."
That's not entirely true. I see that ML Engine supports building custom containers, which is typically how one installs and configures OS-level dependencies. Custom containers are only available for training, though, so if your needs are in that area it may be worth a try.
I assume you have already checked that the library doesn't support access through an already open file-like handle. If you haven't, "How to restore Tensorflow model from Google bucket without writing to filesystem?" may be of interest.
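If the library does accept an open file object rather than a path, a handle such as the one returned by `file_io.FileIO` can usually be passed in directly. A minimal sketch of the idea, using the standard library's `csv.reader` as a stand-in for such a library and `io.StringIO` as a stand-in for the Cloud Storage handle (both are illustrative choices, not taken from the question, and `count_rows` is a hypothetical helper):

```python
import csv
import io


def count_rows(handle):
    """Count CSV rows read from any file-like object (hypothetical helper)."""
    return sum(1 for _ in csv.reader(handle))


# io.StringIO stands in for a handle such as file_io.FileIO("gs://...", "r").
fake_gcs_file = io.StringIO("a,b\n1,2\n3,4\n")
row_count = count_rows(fake_gcs_file)
print(row_count)
```

Whether a given library behaves this way depends on how it reads its input, so this needs checking per library.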
Upvotes: 2