anjanesh

Reputation: 4251

How to find out which files / buckets are being publicly downloaded the most?

I'm using Google Cloud Storage for a few of my websites, storing static and downloadable content such as CSS, JS, images, and PDFs.

How can I find out which of my files, and in which buckets, are being downloaded the most?

My billing just shows: Cloud Storage Download APAC: 924.637 Gibibytes (Source:Google Storage Project [gs-project-name]) - US$110.84
This is a lot for a month and I want to delete those files.

Are there any stats for this?

Upvotes: 2

Views: 623

Answers (1)

Rohit Manokaran

Reputation: 91

The currently available way to view your usage broken down by bucket or object is to enable access logs for each bucket. Once enabled, GCS exports CSV files containing information about every request made to objects in those buckets. This information can then be aggregated to find the top objects and buckets being downloaded.

See: https://cloud.google.com/storage/docs/access-logs

  1. Create a bucket to store the usage logs:

    gsutil mb gs://my-logs-bucket
    gsutil acl ch -g cloud-storage-analytics@google.com:W gs://my-logs-bucket
    gsutil defacl set project-private gs://my-logs-bucket
    
  2. Enable usage logging for all your buckets:

    gsutil logging set on -b gs://my-logs-bucket gs://my-bucket1
    gsutil logging set on -b gs://my-logs-bucket gs://my-bucket2
    ..
    
  3. At the end of the month, either download the CSVs in gs://my-logs-bucket and analyse them, or load them into BigQuery for analysis:

    wget http://storage.googleapis.com/pub/cloud_storage_usage_schema_v0.json
    bq mk storageanalysis
    bq load --skip_leading_rows=1 storageanalysis.usage \
      gs://my-logs-bucket/*_usage_* ./cloud_storage_usage_schema_v0.json
    
    bq shell
    > SELECT cs_object, SUM(sc_bytes) AS total_bytes
      FROM [storageanalysis.usage]
      GROUP BY cs_object
      ORDER BY total_bytes DESC LIMIT 20
    ..
    > SELECT cs_bucket, SUM(sc_bytes) AS total_bytes
      FROM [storageanalysis.usage]
      GROUP BY cs_bucket
      ORDER BY total_bytes DESC LIMIT 20
    ..
    > QUIT
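
If you would rather go the "download the CSVs and analyse them" route from step 3, a rough Python sketch along these lines could do the aggregation locally. It assumes the usage logs have already been copied to a local directory (the `logs/` path and the `top_downloads` function name are made up for illustration) and that each file starts with a header row containing the `cs_bucket`, `cs_object` and `sc_bytes` columns used in the queries above. Like those queries, it sums `sc_bytes` over every logged request rather than only downloads.

```python
import csv
import glob
from collections import Counter


def top_downloads(csv_paths, limit=20):
    """Sum sc_bytes per bucket and per (bucket, object) across usage-log CSVs."""
    by_bucket = Counter()
    by_object = Counter()
    for path in csv_paths:
        with open(path, newline="") as f:
            # DictReader takes the column names from the header row of each file
            for row in csv.DictReader(f):
                sent = int(row["sc_bytes"] or 0)  # treat an empty field as 0 bytes
                by_bucket[row["cs_bucket"]] += sent
                by_object[(row["cs_bucket"], row["cs_object"])] += sent
    return by_bucket.most_common(limit), by_object.most_common(limit)


if __name__ == "__main__":
    # "logs/" is a hypothetical local directory holding the downloaded usage logs
    buckets, objects = top_downloads(glob.glob("logs/*_usage_*"))
    print("Top buckets by bytes sent:")
    for name, total in buckets:
        print(f"{total:>15,d}  {name}")
    print("Top objects by bytes sent:")
    for (bucket, obj), total in objects:
        print(f"{total:>15,d}  gs://{bucket}/{obj}")
```

Keying the object totals on the (bucket, object) pair keeps files with the same name in different buckets apart, which matters if several sites share file names like `style.css`.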
    

Upvotes: 3
