Reputation: 1035
I'm new to Python and boto, and I'm trying to write a DAG that checks the size of an S3 file given its bucket location and file name. How can I take the file location (s3://bucket-info/folder/filename) and get the size of the file? If the file size is greater than 0 KB, I need to fail the job.
Thank you for your time.
Upvotes: 0
Views: 10315
Reputation: 1693
You can also list all objects if several files need to be checked. For a given bucket, call list_objects_v2 and then iterate through the 'Contents' of the response. For example:
import boto3

s3_client = boto3.client('s3')
# Default to an empty list: 'Contents' is absent when nothing matches
response_contents = s3_client.list_objects_v2(
    Bucket='name_of_bucket'
).get('Contents', [])
You'll get a list of dictionaries like this:
[{'Key': 'path/to/object1', 'LastModified': datetime, 'ETag': '"some etag"', 'Size': 2600, 'StorageClass': 'STANDARD'},
 {'Key': 'path/to/object2', 'LastModified': datetime, 'ETag': '"some etag"', 'Size': 454, 'StorageClass': 'STANDARD'},
 ...]
Notice that each dictionary in the list contains a 'Size' key, which holds the size of that object in bytes. The list is iterable, so you can loop over it:
for rc in response_contents:
    if rc.get('Key') == 'path/to/file':
        print(f"Size: {rc.get('Size')}")
Drop the key check (or test against several keys) to print the sizes of all the files you are interested in:
Size: 2600
Size: 454
Size: 2600
...
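One thing to watch: a single list_objects_v2 call returns at most 1,000 objects, so a larger bucket needs pagination. Here is a minimal sketch using boto3's paginator; the function name `object_sizes` and its `prefix` and `client` parameters are made up for illustration and are not part of boto3.

```python
def object_sizes(bucket, prefix="", client=None):
    """Return a {key: size_in_bytes} dict for objects under a prefix.

    `object_sizes` is a hypothetical helper, not a boto3 function.
    """
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3
        client = boto3.client("s3")

    # The paginator issues follow-up requests until every page is fetched.
    paginator = client.get_paginator("list_objects_v2")
    sizes = {}
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' key when nothing matched.
        for obj in page.get("Contents", []):
            sizes[obj["Key"]] = obj["Size"]
    return sizes
```

You could then look up individual files with something like `object_sizes('name_of_bucket', prefix='path/to/').get('path/to/file')`. Passing a prefix keeps the listing small when the bucket holds many unrelated objects.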
Upvotes: 1
Reputation: 4480
You can use boto3's head_object for this.
Here's something that will get you the size in bytes. Replace the bucket and key with your own values:
import boto3

client = boto3.client(service_name='s3', use_ssl=True)
response = client.head_object(
    Bucket='bucketname',
    Key='full/path/to/file.jpg'
)
print(response['ContentLength'])
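Since the question starts from a full location like s3://bucket-info/folder/filename and needs the job to fail, here is a rough sketch that splits the location into bucket and key, calls head_object, and raises when the file is larger than 0 bytes. The helper names (`split_s3_uri`, `get_s3_object_size`, `fail_if_not_empty`) are made up for illustration, and the choice of ValueError is an assumption; if the DAG runs in Airflow, an exception raised from a Python task callable should mark that task as failed.

```python
def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the key.
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both a bucket and a key: {s3_uri!r}")
    return bucket, key


def get_s3_object_size(s3_uri, client=None):
    """Return the object's size in bytes via head_object."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3
        client = boto3.client("s3")
    bucket, key = split_s3_uri(s3_uri)
    response = client.head_object(Bucket=bucket, Key=key)
    return response["ContentLength"]


def fail_if_not_empty(s3_uri, client=None):
    """Raise if the object is larger than 0 bytes, as the question asks."""
    size = get_s3_object_size(s3_uri, client=client)
    if size > 0:
        raise ValueError(f"{s3_uri} is {size} bytes; expected an empty file")
    return size
```

Note that head_object raises a botocore ClientError when the key does not exist, so a missing file would also fail the job unless you catch that separately.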
Upvotes: 2