prasanna Kumar

Reputation: 51

Get the size of a bucket by storage class in Google Cloud Storage

I would like to get the size of a bucket broken down by storage class. I've added lifecycle rules to the bucket to change the storage class of files based on their age. I've used the commands below.

gsutil du -sh gs://[bucket-name] 

To get object metadata:

gsutil ls -L gs://[bucket-name] 

To set the lifecycle configuration on the bucket (an example configuration is shown below):

gsutil lifecycle set life-cycle.json gs://[bucket-name] 
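For reference (this is only an illustrative sketch, not my actual configuration), a life-cycle.json that moves objects to colder storage classes as they age might look like:

{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
      "condition": {"age": 30}
    },
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 365}
    }
  ]
}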

Could anyone help me resolve this issue?

Upvotes: 1

Views: 1720

Answers (1)

Maxim

Reputation: 4441

Edit:

I have filed a Feature Request for this on the Public Issue Tracker. In the meantime, the code below can be used.

I believe there is no gsutil command that can show you the total size by storage class for a GCS bucket.

However, using the Cloud Storage Client Library for Python, I made a script that does what you’re asking for:

from google.cloud import storage
import math

### SET THESE VARIABLES ###
PROJECT_ID = "" 
CLOUD_STORAGE_BUCKET = "" 
###########################

def _get_storage_client():
    return storage.Client(
        project=PROJECT_ID)

def convert_size(size_bytes):
    # Convert a raw byte count into a human-readable string, e.g. "1.5 GB"
    if size_bytes == 0:
        return "0 B"
    size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(size_bytes, 1024)))
    p = math.pow(1024, i)
    s = round(size_bytes / p, 2)
    return "%s %s" % (s, size_name[i])

def size_by_class():
    client = _get_storage_client()
    bucket = client.bucket(CLOUD_STORAGE_BUCKET)
    blobs = bucket.list_blobs()

    # Tally object sizes by storage class
    size_multi_regional = size_regional = size_nearline = size_coldline = 0
    for blob in blobs:
        if blob.storage_class == "MULTI_REGIONAL":
            size_multi_regional += blob.size
        elif blob.storage_class == "REGIONAL":
            size_regional += blob.size
        elif blob.storage_class == "NEARLINE":
            size_nearline += blob.size
        elif blob.storage_class == "COLDLINE":
            size_coldline += blob.size

    print("MULTI_REGIONAL: "+str(convert_size(size_multi_regional))+"\n"+
            "REGIONAL: "+str(convert_size(size_regional)+"\n"+
            "NEARLINE: "+str(convert_size(size_nearline))+"\n"+
            "COLDLINE: "+str(convert_size(size_coldline))
            ))

if __name__ == '__main__':
    size_by_class()

To run this program from the Google Cloud Shell, make sure you have previously installed the Client Library for Python with:

pip install --upgrade google-cloud-storage

And in order to provide authentication credentials to the application code, you must point the environment variable GOOGLE_APPLICATION_CREDENTIALS to the location of the JSON file that contains your service account key:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"
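If you prefer not to set an environment variable, the Python client can also load the key file directly; a minimal sketch (the path is a placeholder) that would replace _get_storage_client() above:

def _get_storage_client():
    # Alternative: load the service account key file explicitly instead of
    # relying on the GOOGLE_APPLICATION_CREDENTIALS environment variable.
    # The path below is a placeholder.
    return storage.Client.from_service_account_json(
        "/home/user/Downloads/[FILE_NAME].json", project=PROJECT_ID)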

Before running the script, set PROJECT_ID to the ID of your project and CLOUD_STORAGE_BUCKET to the name of your GCS bucket.

Run the script with python main.py. Output should be something like:

MULTI_REGIONAL: 1.0 GB
REGIONAL: 300 MB
NEARLINE: 200 MB
COLDLINE: 10 MB
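As a variation (not required for the above, just a sketch), the four counters could be replaced with a dictionary keyed on blob.storage_class, which also picks up any classes the script does not explicitly check for, such as STANDARD:

from collections import defaultdict

def size_by_class_generic():
    # Tally object sizes for whatever storage classes actually appear in the bucket
    client = _get_storage_client()
    bucket = client.bucket(CLOUD_STORAGE_BUCKET)
    totals = defaultdict(int)
    for blob in bucket.list_blobs():
        totals[blob.storage_class] += blob.size
    for storage_class, total in sorted(totals.items()):
        print("%s: %s" % (storage_class, convert_size(total)))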

Upvotes: 2
