Reputation: 3198
I need to recursively move the contents of a sub-folder to a parent folder in google cloud storage. This code works for moving a single file from sub-folder to the parent.
client = storage.Client()
bucket = client.get_bucket(BUCKET_NAME)
source_path = Path(parent_dir, sub_folder, filename).as_posix()
source_blob = bucket.blob(source_path)
dest_path = Path(parent_dir, filename).as_posix()
bucket.copy_blob(source_blob, bucket, dest_path)
but I don't know how to properly format the command because if my dest_path is "parent_dir", I get the following error:
google.api_core.exceptions.NotFound: 404 POST https://storage.googleapis.com/storage/v1/b/bucket/o/parent_dir%2Fsubfolder/copyTo/b/geo-storage/o/parent_dir?prettyPrint=false: No such object: geo-storage/parent_dir/subfolder
Note: This works for recursive copy with gsutils but I would prefer to use the blob object:
os.system(f"gsutil cp -r gs://bucket/parent_dir/subfolder/* gs://bucket/parent_dir")
Upvotes: 0
Views: 858
Reputation: 9721
GCS does not have the concept of "directory" - just a flat namespace of objects. You can have an object named "foo/a.txt" and "foo/b.txt", but there is not actual thing representing "foo/" - its' just a prefix on the object names.
gsutil
uses prefixes on the object names to pretend directories exist, but under the hood it is really just acting on all the individual objects with that prefix.
You need to do the same, with a copy for each object:
client = storage.Client()
bucket = client.Bucket(BUCKET_NAME)
for source_blob in client.list_blobs(BUCKET_NAME, prefix="old/prefix/"):
dest_name = source_blob.name.replace("old/prefix/", "new/prefix/")
# do these in parallel for more speed
bucket.copy_blob(source_blob, bucket, new_name=dest_name)
Upvotes: 2