Reputation: 8165
I have a dumb question.
So I have terabytes of data to rsync between two GCP buckets.
I'm not too sure how gsutil rsync works behind the scenes.
Does it have to download the files locally before it uploads them to the destination, or does it just magically move things over from the source bucket to the destination?
Upvotes: 1
Views: 1880
Reputation: 4640
I ran a test with gsutil rsync and the debug flag (-d), and this is the behaviour I observed:
When you move an object between buckets (using cp or rsync), it is not downloaded to your local machine. I used a ~4 GB file and glances to measure network usage during the rsync operation, and the objects were moved directly to the target bucket.
If you run the following command, you will notice that the SDK performs a POST request indicating the movement between buckets:
gsutil -d rsync gs://sourcebucket gs://targetbucket
https://storage.googleapis.com/storage/v1/b/sourcebucket/o/bigfile.iso/rewriteTo/b/targetbucket/o/bigfile.iso
The rewriteTo behaviour is documented in the Cloud Storage JSON API reference, under the Objects: rewrite method.
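To make the request above more concrete, here is a minimal Python sketch of how such a rewriteTo URL is put together and how a client might repeat the call for a large object. It is an illustration, not gsutil's actual code: `rewrite_url`, `rewrite_object` and the `post` callable are hypothetical names, and the loop assumes the response fields `done` and `rewriteToken` described in the JSON API reference.

```python
from urllib.parse import quote

API_ROOT = "https://storage.googleapis.com/storage/v1"


def rewrite_url(src_bucket, src_object, dst_bucket, dst_object):
    """Build the rewriteTo URL for a bucket-to-bucket copy (hypothetical helper)."""
    # An object name is a single path segment, so "/" is percent-encoded as well.
    src = quote(src_object, safe="")
    dst = quote(dst_object, safe="")
    return f"{API_ROOT}/b/{src_bucket}/o/{src}/rewriteTo/b/{dst_bucket}/o/{dst}"


def rewrite_object(post, src_bucket, src_object, dst_bucket, dst_object):
    """Repeat the rewrite call until the service reports it is done.

    `post` is a hypothetical callable (url, params) -> dict that sends an
    authenticated POST and returns the parsed JSON response. Only these
    small requests and responses pass through the machine running the
    client; the object data itself is copied on the server side.
    Returns the number of calls that were needed.
    """
    url = rewrite_url(src_bucket, src_object, dst_bucket, dst_object)
    params = {}
    calls = 0
    while True:
        response = post(url, params)
        calls += 1
        if response.get("done"):
            return calls
        # Assumption: an unfinished rewrite returns a token to resume with.
        params = {"rewriteToken": response["rewriteToken"]}
```

Calling `rewrite_url("sourcebucket", "bigfile.iso", "targetbucket", "bigfile.iso")` yields the same URL that appears in the debug output above.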
Upvotes: 3
Reputation: 317712
The answer to your question is in the gsutil rsync documentation:
Note 2: If you are synchronizing a large amount of data between clouds you might consider setting up a Google Compute Engine account and running gsutil there. Since cross-provider gsutil data transfers flow through the machine where gsutil is running, doing this can make your transfer run significantly faster than running gsutil on your local workstation.
So yes, it downloads the content locally first, then uploads it to the destination.
Upvotes: 3