Reputation: 4251
I'm using Google Cloud Storage for a few of my websites to store static and downloadable content such as CSS, JS, images and PDFs.
How can I find out which of my files, and in which bucket, are being downloaded the most?
My billing just shows:
Cloud Storage Download APAC: 924.637 Gibibytes (Source:Google Storage Project [gs-project-name]) - US$110.84
This is a lot for a month and I want to delete those files.
Are there any stats for this?
Upvotes: 2
Views: 623
Reputation: 91
The currently available way to view your usage broken down by bucket or object is to enable access logs for each bucket. Once enabled, GCS will export CSV files that contain information about all requests made to objects in those buckets. This information can then be aggregated to find the top objects and buckets being downloaded.
See: https://cloud.google.com/storage/docs/access-logs
Create a bucket to store the usage logs:
gsutil mb gs://my-logs-bucket
gsutil acl ch -g cloud-storage-analytics@google.com:W gs://my-logs-bucket
gsutil defacl set project-private gs://my-logs-bucket
Enable usage logging for all your buckets:
gsutil logging set on -b gs://my-logs-bucket gs://my-bucket1
gsutil logging set on -b gs://my-logs-bucket gs://my-bucket2
..
At the end of the month, either download the CSVs in gs://my-logs-bucket and analyse them, or load them into BigQuery for analysis:
wget http://storage.googleapis.com/pub/cloud_storage_usage_schema_v0.json
bq mk storageanalysis
bq load --skip_leading_rows=1 storageanalysis.usage \
gs://my-logs-bucket/*_usage_* ./cloud_storage_usage_schema_v0.json
bq shell
> SELECT cs_object, SUM(sc_bytes) AS sc_bytes
FROM [storageanalysis.usage]
GROUP BY cs_object
ORDER BY sc_bytes desc LIMIT 20
..
> SELECT cs_bucket, SUM(sc_bytes) AS sc_bytes
FROM [storageanalysis.usage]
GROUP BY cs_bucket
ORDER BY sc_bytes desc LIMIT 20
..
> QUIT
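If you would rather take the first option and analyse the downloaded CSVs locally instead of using BigQuery, something along these lines might do. It is only a rough sketch: the function name top_downloads and the ./logs/ directory are made up for illustration, and it assumes each usage log file starts with a header row containing the cs_bucket, cs_object and sc_bytes columns (the same columns the queries above rely on).

```python
import csv
import glob
from collections import Counter


def top_downloads(log_paths, limit=20):
    """Sum sc_bytes per (bucket, object) across usage-log CSV files.

    Hypothetical helper, not part of gsutil or any Google library.
    """
    totals = Counter()
    for path in log_paths:
        with open(path, newline="") as handle:
            # DictReader takes the column names from each file's header row.
            for row in csv.DictReader(handle):
                key = (row["cs_bucket"], row["cs_object"])
                # Treat an empty sc_bytes field as zero bytes sent.
                totals[key] += int(row["sc_bytes"] or 0)
    # Largest totals first, at most `limit` entries.
    return totals.most_common(limit)


if __name__ == "__main__":
    # Assumed local directory, e.g. filled beforehand with:
    #   gsutil -m cp "gs://my-logs-bucket/*_usage_*" ./logs/
    for (bucket, obj), sent in top_downloads(glob.glob("./logs/*_usage_*")):
        print(f"{sent:>15,d}  gs://{bucket}/{obj}")
```

Grouping on the (bucket, object) pair rather than the object name alone keeps files with the same name in different buckets apart, which matters if you want to know where to delete from.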
Upvotes: 3