Reputation: 371
I have an application which processes CSV files and returns some analysis. My users have files stored in GCP Cloud Storage buckets, and I would like to enable them to pass me a bucket URL and some auth token / signed URL, and the application will then download the files and parse them as needed.
Reading the GCP documentation, I came upon the following gsutil command:
gsutil cp -r gs://my_bucket
This is exactly what I need, however I am looking for this same functionality through some REST API HTTP request. I am certain something like this exists, but cannot seem to find it. Alternatively, if I could "list" all files in a bucket and download them one by one, that would also be OK, but obviously less convenient.
Upvotes: 1
Views: 3299
Reputation: 3617
You can make calls to either of the two Cloud Storage REST APIs: JSON or XML.
To download a file from a Google Cloud Storage bucket, use cURL to make a GET Object request to https://www.googleapis.com/storage/v1/b/<bucket>/o/<object>, where <bucket> is the name of your Cloud Storage bucket and <object> is the name of a file in that bucket. For objects that are not public, pass an OAuth 2.0 access token in the Authorization header.
JSON API:
curl -X GET \
  -H "Authorization: Bearer [OAUTH2_TOKEN]" \
  -o "[SAVE_TO_LOCATION]" \
  "https://www.googleapis.com/storage/v1/b/[BUCKET_NAME]/o/[OBJECT_NAME]?alt=media"
XML API:
curl -X GET \
  -H "Authorization: Bearer [OAUTH2_TOKEN]" \
  -o "[SAVE_TO_LOCATION]" \
  "https://storage.googleapis.com/[BUCKET_NAME]/[OBJECT_NAME]"
You can read the docs for this API request here. We have code samples for a number of client libraries/languages (Python, Node.js, Java) that show how to download objects from buckets in Cloud Storage.
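As a rough illustration, here is a minimal Python sketch of the same JSON API call using the requests library; the bucket name, object name, token, and output path are placeholders, not values from the question:

import requests
from urllib.parse import quote

# Placeholders (assumptions) - supply your own values.
bucket_name = "my_bucket"
object_name = "reports/data.csv"   # object names containing "/" must be URL-encoded
oauth2_token = "ya29...."          # an OAuth 2.0 access token with a storage read scope
save_to = "data.csv"

# GET Object via the JSON API; alt=media returns the object data rather than its metadata.
url = (
    f"https://www.googleapis.com/storage/v1/b/{bucket_name}"
    f"/o/{quote(object_name, safe='')}?alt=media"
)
response = requests.get(url, headers={"Authorization": f"Bearer {oauth2_token}"})
response.raise_for_status()

with open(save_to, "wb") as f:
    f.write(response.content)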
Note that for multiple files you will have to program the requests yourself, for example by listing the objects and downloading each one; a sketch of such a loop is included after the gsutil notes below. If you simply want to download all the objects in a bucket or subdirectory, it's easier to use gsutil instead. For the transfer you might also want to use the gsutil -m option, which performs a parallel (multi-threaded/multi-processing) copy:
gsutil -m cp -R gs://your-bucket
If you want to copy into a particular local directory, note that the directory must exist first, as gsutil won't create it automatically:
mkdir my-bucket-local-copy && gsutil -m cp -r gs://your-bucket my-bucket-local-copy
The time reduction from the parallel copy can be quite significant. See this Cloud Storage documentation for complete information on the gsutil cp command, and have a look at this Stack Overflow thread on how to download a folder from a Cloud Storage bucket.
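If you prefer to stay with plain REST calls, a rough sketch of the list-then-download loop mentioned above could look like the following (again using the requests library, with placeholder bucket, token, and directory names; it pages through the JSON API objects list and fetches each object with alt=media):

import os
import requests
from urllib.parse import quote

# Placeholders (assumptions): replace with your own bucket, token, and target directory.
bucket_name = "my_bucket"
oauth2_token = "ya29...."
dest_dir = "my-bucket-local-copy"
headers = {"Authorization": f"Bearer {oauth2_token}"}

base = f"https://www.googleapis.com/storage/v1/b/{bucket_name}/o"
params = {}
while True:
    # List one page of objects in the bucket.
    listing = requests.get(base, headers=headers, params=params)
    listing.raise_for_status()
    body = listing.json()

    for item in body.get("items", []):
        name = item["name"]
        if name.endswith("/"):
            continue  # skip "folder" placeholder objects
        # Download the object contents (alt=media).
        data = requests.get(f"{base}/{quote(name, safe='')}",
                            headers=headers, params={"alt": "media"})
        data.raise_for_status()
        local_path = os.path.join(dest_dir, name)
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with open(local_path, "wb") as f:
            f.write(data.content)

    # Continue until there are no more pages.
    if "nextPageToken" not in body:
        break
    params["pageToken"] = body["nextPageToken"]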
If you need to perform authenticated downloads, Google Cloud Storage also supports signed URLs for download. These URLs describe specific operations on Cloud Storage, such as a download, and carry a time-limited signature. Anyone who has the URL can perform the specified operation, so they're safe to pass from your server to a client, but there are some considerations to keep in mind when working with signed URLs. We have code samples for a number of client libraries/languages that create signed URLs to download an object.
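As an illustration only, here is a small sketch of generating a V4 signed download URL with the google-cloud-storage Python client; the bucket and object names are placeholders, and the client must run with credentials that are able to sign (for example a service account key):

from datetime import timedelta
from google.cloud import storage

# Placeholders (assumptions): your bucket and object.
client = storage.Client()  # uses application default credentials
blob = client.bucket("my_bucket").blob("reports/data.csv")

# Create a V4 signed URL valid for 15 minutes; anyone holding the URL
# can GET the object until it expires.
url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=15),
    method="GET",
)
print(url)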
Upvotes: 0
Reputation: 2396
Unfortunately it's not possible to achieve what you're asking with a single request; the only solution, as you proposed, is to list the files and download them one by one (which is what the gsutil command does under the hood).
Even the code samples documentation states:
To easily download all objects in a bucket or subdirectory, use the gsutil cp command.
You could, however, use the subprocess module to call the gsutil command from your Python script.
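A minimal sketch of that approach, assuming gsutil is installed and authenticated on the machine, and using placeholder bucket and directory names:

import subprocess
from pathlib import Path

# Placeholders (assumptions): your bucket and a local destination directory.
bucket = "my_bucket"
dest = Path("my-bucket-local-copy")
dest.mkdir(parents=True, exist_ok=True)  # gsutil won't create the directory for you

# -m enables parallel copies; cp -r copies every object in the bucket.
subprocess.run(
    ["gsutil", "-m", "cp", "-r", f"gs://{bucket}", str(dest)],
    check=True,  # raise CalledProcessError if gsutil fails
)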
Upvotes: 1
Reputation: 2199
The api reference can be found here: https://cloud.google.com/storage/docs/apis
You will probably need to combine the information in 'Authenticating to the API' and 'JSON API -> API reference -> Objects -> get'.
Alternatively, you could find this information in the Cloud Storage how-to guides: https://cloud.google.com/storage/docs/downloading-objects#rest-download-object
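To illustrate how those two pieces fit together, here is a rough Python sketch that obtains an access token with the google-auth library and then calls the JSON API objects get endpoint; the bucket and object names are placeholders:

import google.auth
import requests
from google.auth.transport.requests import Request
from urllib.parse import quote

# Placeholders (assumptions): your bucket and object.
bucket_name = "my_bucket"
object_name = "reports/data.csv"

# 'Authenticating to the API': get an access token from application default credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/devstorage.read_only"]
)
credentials.refresh(Request())

# 'Objects: get': alt=media returns the object contents rather than its metadata.
url = (
    f"https://www.googleapis.com/storage/v1/b/{bucket_name}"
    f"/o/{quote(object_name, safe='')}?alt=media"
)
resp = requests.get(url, headers={"Authorization": f"Bearer {credentials.token}"})
resp.raise_for_status()

with open("data.csv", "wb") as f:
    f.write(resp.content)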
Upvotes: 0