Reputation: 367
For directories on a local machine, the os.walk() function is commonly used for walking a directory tree in Python.
Google has a Python module (google.cloud.storage) for uploading to and downloading from a GCP bucket in a locally-run Python script.
I need a way to walk directory trees in a GCP bucket. I browsed through the classes in the google.cloud Python module, but could not find anything. Is there a way to perform something similar to os.walk() on directories inside a GCP bucket?
Upvotes: 7
Views: 2952
Reputation: 1991
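import os
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('bucket_name')
for blob in bucket.list_blobs(prefix=''):
    # Skip zero-byte "folder" placeholder objects
    if blob.name.endswith('/'):
        continue
    # Create the local parent directory if the object name contains one
    local_dir = os.path.dirname(blob.name)
    if local_dir:
        os.makedirs(local_dir, exist_ok=True)
    # Download the file
    with open(blob.name, 'wb') as file_obj:
        client.download_blob_to_file(blob, file_obj)
    # Your logic on the file
    # logic goes here
    # Remove the local file
    os.remove(blob.name)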
Upvotes: 1
Reputation: 38379
No such function exists in the GCS library. However, GCS can list objects by prefix, which is usually sufficiently equivalent:
from google.cloud import storage

bucket = storage.Client().get_bucket(bucket_name)
for blob in bucket.list_blobs(prefix="dir1/"):
    print(blob.name)
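If you want the (dirpath, dirnames, filenames) shape that os.walk() gives you, one option is to group the flat object names yourself. The following is only a minimal sketch, not part of the GCS library: walk_names and walk_bucket are hypothetical helper names, and it assumes object names use "/" as the separator with no leading or doubled slashes. It also reads the whole listing into memory, so it may not suit very large buckets.

```python
from collections import defaultdict


def walk_names(names, top=""):
    """Yield (dirpath, dirnames, filenames) tuples, os.walk-style, from flat object names."""
    top = top.strip("/")
    dirs = defaultdict(set)    # directory path -> names of its immediate subdirectories
    files = defaultdict(list)  # directory path -> names of files directly inside it
    dirs[top]  # make sure the top-level directory is always reported
    for name in names:
        if top and not name.startswith(top + "/"):
            continue  # ignore anything outside the requested subtree
        parent, _, base = name.rpartition("/")
        if base:  # names ending in "/" are folder placeholders, not files
            files[parent].append(base)
        # Register every ancestor directory between this object and top
        while parent != top:
            grandparent, _, child = parent.rpartition("/")
            dirs[grandparent].add(child)
            dirs[parent]  # the directory itself should appear even if it has no subdirectories
            parent = grandparent
    # Sorting the paths lists every parent before its children
    for dirpath in sorted(dirs):
        yield dirpath, sorted(dirs[dirpath]), sorted(files[dirpath])


def walk_bucket(bucket, top=""):
    """Hypothetical wrapper: run walk_names over a google.cloud.storage bucket listing."""
    top = top.strip("/")
    prefix = top + "/" if top else ""
    return walk_names((blob.name for blob in bucket.list_blobs(prefix=prefix)), top)
```

With the bucket object from the snippet above, usage would look like "for dirpath, dirnames, filenames in walk_bucket(bucket, 'dir1'): ...". Unlike os.walk(), editing dirnames in place does not prune the traversal here, because the listing has already been fetched.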
Upvotes: 6