Reputation: 143
NCBI (the National Center for Biotechnology Information) generously provides its data for third parties to consume. The data is located in cloud buckets such as gs://sra-pub-run-1/. I would like to read this data without incurring additional costs, which I believe can be achieved by reading it from the same region as the one where the bucket is hosted. Unfortunately, I can't figure out in which region the bucket is hosted (NCBI mentions in their docs that it's in the US, but not where in the US). So my questions are:
In which region is gs://sra-pub-run-1/ hosted?
Running a simple gsutil ls -b -L either provides no information (when listing a specific directory within sra-pub-run-1), or I get a permission denied error if I try to list info on gs://sra-pub-run-1/ directly using:
gsutil -u metagraph ls -b gs://sra-pub-run-1/
Upvotes: 2
Views: 4195
Reputation: 38379
You cannot specify a specific Compute Engine zone as a bucket location, but all Compute Engine VM instances in zones within a given region have similar performance when accessing buckets in that region.
Billing-wise, egressing data from Cloud Storage into a Compute Engine instance in the same location/region (for example, US-EAST1 to US-EAST1) is free, regardless of zone.
So, check the "Location constraint" of the GCS bucket (gsutil ls -Lb gs://bucketname). If it says "US-EAST1" and your GCE instance is also in US-EAST1, downloading data from that GCS bucket will not incur an egress fee.
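As a rough illustration of the comparison described above, here is a minimal Python sketch. The helper names zone_to_region and same_region are hypothetical, and the sketch assumes you already have the bucket's location constraint as a string (for example, copied from the gsutil output) and the zone name of your VM. It only encodes the same-region rule from this answer; multi-region locations such as "US" are not handled and simply compare as not equal.

```python
def zone_to_region(zone: str) -> str:
    """Derive a region from a Compute Engine zone name.

    Zone names are the region plus a letter suffix,
    e.g. 'us-east1-b' belongs to region 'us-east1'.
    """
    region, _, suffix = zone.rpartition("-")
    if not region or not suffix:
        raise ValueError(f"not a zone name: {zone!r}")
    return region.lower()


def same_region(bucket_location: str, vm_zone: str) -> bool:
    """Return True if the bucket location equals the VM's region.

    Hypothetical helper: the comparison is case-insensitive because
    gsutil prints locations in upper case (e.g. 'US-EAST1').
    Multi-region locations such as 'US' never match here.
    """
    return bucket_location.strip().lower() == zone_to_region(vm_zone.strip())


if __name__ == "__main__":
    print(same_region("US-EAST1", "us-east1-b"))
    print(same_region("US-EAST1", "us-central1-a"))
```

If same_region returns True, the rule above suggests that downloads from the bucket to that VM should not be charged as egress; for any other result, check the current Cloud Storage pricing page rather than relying on this sketch.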
Upvotes: 1