JMV12

Reputation: 1035

How To Use boto3 To Retrieve S3 File Size

I'm new to Python and boto3, and I'm currently trying to write a DAG that will check an S3 file's size given the bucket location and file name. How can I take the file location (s3://bucket-info/folder/filename) and get the size of the file? If the file size is greater than 0 KB, I will need to fail the job.

Thank you for your time

Upvotes: 0

Views: 10315

Answers (2)

Leo Skhrnkv

Reputation: 1693

You can also get a list of all objects if multiple files need to be checked. For a given bucket, run list_objects_v2 and then iterate through the response's 'Contents'. For example:

import boto3

s3_client = boto3.client('s3')
response_contents = s3_client.list_objects_v2(
    Bucket='name_of_bucket'
).get('Contents')

You'll get a list of dictionaries like this:

[{'Key': 'path/to/object1', 'LastModified': datetime, 'ETag': '"some etag"', 'Size': 2600, 'StorageClass': 'STANDARD'}, {'Key': 'path/to/object2', 'LastModified': datetime, 'ETag': '"some etag"', 'Size': 454, 'StorageClass': 'STANDARD'}, ... ]

Notice that each dictionary in the list contains a 'Size' key, which is the size of that particular object in bytes. The list is iterable:

for rc in response_contents:
    if rc.get('Key') == 'path/to/file':
        print(f"Size: {rc.get('Size')}")

This way you get the sizes of all the files you're interested in:

Size: 2600
Size: 454
Size: 2600
...
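
Note that list_objects_v2 returns at most 1,000 keys per call, so if the bucket holds more objects than that, a paginator is the usual way to walk all of them. A minimal sketch, with 'name_of_bucket' as a placeholder:

import boto3

s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')

# Walk every page of results rather than just the first 1,000 keys
for page in paginator.paginate(Bucket='name_of_bucket'):
    for obj in page.get('Contents', []):
        print(f"{obj['Key']}: {obj['Size']}")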

Upvotes: 1

Ninad Gaikwad

Reputation: 4480

You can use boto3's head_object for this.

Here's something that will get you the size. Replace bucket and key with your own values:

import boto3

client = boto3.client(service_name='s3', use_ssl=True)

response = client.head_object(
    Bucket='bucketname',
    Key='full/path/to/file.jpg'
)
print(response['ContentLength'])
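
If you're starting from the full s3://bucket-info/folder/filename path mentioned in the question, a minimal sketch (the path and the raised exception are just placeholders for however your job should fail) could split the path first and then check the size:

import boto3
from urllib.parse import urlparse

def check_s3_file_size(s3_path):
    # Split "s3://bucket-info/folder/filename" into bucket and key
    parsed = urlparse(s3_path)
    bucket, key = parsed.netloc, parsed.path.lstrip('/')

    # head_object returns metadata only; 'ContentLength' is the size in bytes
    size = boto3.client('s3').head_object(Bucket=bucket, Key=key)['ContentLength']

    # Fail the job if the file is not empty (placeholder for your own failure logic)
    if size > 0:
        raise ValueError(f"{s3_path} is {size} bytes, expected an empty file")
    return size

check_s3_file_size('s3://bucket-info/folder/filename')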

Upvotes: 2
