Della

Reputation: 1632

How to know the size of an Azure blob object via Python Azure SDK

I am following the Microsoft Azure documentation for Python developers. The azure.storage.blob.models.Blob class has a special method called __sizeof__(), but it returns a constant value of 16 whether the blob is empty (0 bytes) or 1 GB. Is there any method or attribute of a blob object that I can use to check its size dynamically?

To be clearer, this is what my source code looks like:

for i in blobService.list_blobs(container_name=container, prefix=path):
    if i.name.endswith('.json') and r'CIJSONTM.json/part' in i.name:
        #do some stuffs

However, the data pool contains many empty blobs having legitimate names, and before I #do some stuffs, I want to have an additional check on the size to judge whether I am dealing with an empty blob.

Also, a bonus question: what exactly does the __sizeof__() method return, if not the size of the blob object?

Upvotes: 3

Views: 12031

Answers (3)

PC_User

Reputation: 97

In case anyone is wondering how to do this with the v12 SDK, i.e., without BlockBlobService:

from azure.storage.blob import BlobClient

# Placeholders: substitute your own account, key, container, and blob names.
connection_string = "DefaultEndpointsProtocol=https;AccountName=YOUR-ACCOUNT;AccountKey=YOUR-KEY;EndpointSuffix=core.windows.net"
container_name = "YOUR-CONTAINER"

blob_client = BlobClient.from_connection_string(conn_str=connection_string, container_name=container_name, blob_name="YOUR-BLOB")

# Size of the blob in bytes
size = blob_client.get_blob_properties().size

According to the documentation, this gives you the size in bytes.

Upvotes: 2

Ailie Tourlog

Reputation: 41

from azure.storage.blob import BlobServiceClient

blob_service_client = BlobServiceClient.from_connection_string(connect_str)
blob_list = blob_service_client.get_container_client(my_container).list_blobs()
for blob in blob_list:
    print("\t" + blob.name)
    print('\tsize=', blob.size)
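Combining this with the filter from the question, a minimal sketch of skipping empty blobs could look like the following. The helper name non_empty_json_parts and the sample names are hypothetical, and the stand-in objects only mimic the .name and .size attributes that the listed blobs expose, so the sketch runs without a storage account.

```python
from types import SimpleNamespace


def non_empty_json_parts(blobs, marker="CIJSONTM.json/part"):
    """Yield blobs that match the question's name filter and are not empty.

    `blobs` is any iterable of objects exposing `.name` and `.size`
    (hypothetical helper, not part of the Azure SDK).
    """
    for blob in blobs:
        if blob.name.endswith(".json") and marker in blob.name and blob.size > 0:
            yield blob


# Stand-in objects (assumed names and sizes) in place of a real listing.
sample = [
    SimpleNamespace(name="pool/CIJSONTM.json/part-0000.json", size=0),
    SimpleNamespace(name="pool/CIJSONTM.json/part-0001.json", size=2048),
    SimpleNamespace(name="pool/other.json", size=512),
]

kept = [blob.name for blob in non_empty_json_parts(sample)]
print(kept)
```

With a real container, the same helper should accept the result of get_container_client(my_container).list_blobs(name_starts_with=path), where path is the prefix from the question.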

Upvotes: 4

Tom Sun

Reputation: 24569

I want to have an additional check on the size to judge whether I am dealing with an empty blob.

We can use the content_length property of the blob's properties to check whether it is an empty blob.

block_blob_service.get_blob_properties(container_name, blob_name).properties.content_length

The following demo code shows how to get the blob content_length:

from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='accountName', account_key='accountKey')
container_name = 'containerName'
# Creates the container if it does not already exist
block_blob_service.create_container(container_name)
generator = block_blob_service.list_blobs(container_name)
for blob in generator:
    # content_length is the blob size in bytes; 0 means the blob is empty
    length = block_blob_service.get_blob_properties(container_name, blob.name).properties.content_length
    print("\t Blob name: " + blob.name)
    print(length)
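On the bonus question: __sizeof__() is not specific to the Azure SDK. Every Python object inherits it from object, and it reports the in-memory size of the Python object itself in bytes, not counting the objects it refers to. It therefore says nothing about the blob's payload in storage. A minimal sketch, using a hypothetical FakeBlob class as a stand-in for the SDK's blob object:

```python
class FakeBlob:
    """Hypothetical stand-in for an SDK blob object (not an Azure class)."""

    def __init__(self, name, content):
        self.name = name
        self.content = content


empty = FakeBlob("empty.json", b"")
large = FakeBlob("large.json", b"x" * 1_000_000)

# __sizeof__() measures the wrapper object only, so the payload length
# does not change the result.
same_sizeof = empty.__sizeof__() == large.__sizeof__()
payload_lengths = (len(empty.content), len(large.content))
print(same_sizeof, payload_lengths)
```

The exact number returned depends on the Python version and platform, which is why the sketch compares the two values instead of printing a fixed one.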

Upvotes: 7
