Reputation: 15374
Using boto and Python, I am trying to determine whether a key refers to a folder or a file (I am aware that S3 treats both exactly the same, since I am not dealing with a filesystem directly).
What I have at the moment is two keys:
<Key: my-folder,output/2019/01/28/>
<Key: my-folder,output/2019/01/28/part_1111>
The first is a "Folder" and the second is a "File". I would like to determine whether a key is a "File", but I am unsure how. The obvious test is that the key does not end in a /, but how would I check that in Python?
If I am iterating over the result of list(), can I turn the key into a string or access the key's attributes?
for obj in srcBucket.list():
    # Get the Key object of the given key, in the bucket
    k = Key(srcBucket, obj.key)
    print(k)
<Key: my-folder,output/2019/01/28/>
<Key: my-folder,output/2019/01/28/part_1111>
Upvotes: 4
Views: 13528
Reputation: 3134
You need to check both the 'Key' and the 'Size' of each object:
for obj in boto3.client('s3').list_objects_v2(Bucket='my-bucket')['Contents']:
    # A zero-length object whose key ends in '/' is a folder placeholder
    if obj['Size'] == 0 and obj['Key'][-1] == '/':
        print(f"Folder: {obj['Key']}, (size: {obj['Size']})")
    elif obj['Size'] == 0:
        print(f"Zero-length file: {obj['Key']}, (size: {obj['Size']})")
    else:
        print(f"File: {obj['Key']}, (size: {obj['Size']})")
File: a_folder/a_file.txt, (size: 543)
Folder: a_folder/, (size: 0)
Zero-length file: a_folder/zero_length_file.txt, (size: 0)
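The same check can be pulled into a small helper that works on the plain dictionaries found in 'Contents'. This is only a sketch: classify_entry is a hypothetical name, not part of boto3, and it assumes each entry carries 'Key' and 'Size' as in the listing above. The sample entries are made up so the snippet runs without touching S3.

```python
def classify_entry(entry):
    """Classify one entry from a list_objects_v2 'Contents' list.

    Hypothetical helper, not part of boto3.
    """
    key, size = entry["Key"], entry["Size"]
    if size == 0 and key.endswith("/"):
        return "folder"
    if size == 0:
        return "zero-length file"
    return "file"


# Made-up entries shaped like the ones S3 returns (no AWS call is made).
sample = [
    {"Key": "a_folder/a_file.txt", "Size": 543},
    {"Key": "a_folder/", "Size": 0},
    {"Key": "a_folder/zero_length_file.txt", "Size": 0},
]
for entry in sample:
    print(f"{classify_entry(entry)}: {entry['Key']}")
```

Using key.endswith("/") instead of key[-1] == '/' also avoids an IndexError if an empty string is ever passed in.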
Note also that folder objects do not necessarily have ['ContentType'] == 'application/x-directory':
for s3_obj_summary in boto3.resource('s3').Bucket('my-bucket').objects.all():
    # get() issues one request per object to fetch its metadata
    print(f"{s3_obj_summary.key}: {s3_obj_summary.get()['ContentType']}")
a_folder/: application/octet-stream
Upvotes: 2
Reputation: 19260
Here is a version with exception handling for when no files are found:
import boto3

s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket="logs", Prefix="logs/access/2022122700/")
for page in pages:
    try:
        for obj in page['Contents']:
            print(obj['Key'])  # print the object key
    except KeyError:
        # A page with no matching objects has no 'Contents' entry
        print("Doesn't exist")
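An alternative to catching KeyError is to read 'Contents' with a default value. The following is a rough sketch rather than a drop-in replacement: iter_keys is a hypothetical helper, and fake_pages is made-up data standing in for the paginator output so that the snippet runs without AWS credentials.

```python
def iter_keys(pages):
    """Yield every key from an iterable of list_objects_v2 pages.

    Hypothetical helper; a page with no matches has no 'Contents' entry,
    so .get() falls back to an empty list instead of raising KeyError.
    """
    for page in pages:
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Stand-in pages shaped like paginator output (made-up data, no AWS call).
fake_pages = [
    {"KeyCount": 2, "Contents": [{"Key": "logs/access/a"}, {"Key": "logs/access/b"}]},
    {"KeyCount": 0},
]
keys = list(iter_keys(fake_pages))
print(keys if keys else "Doesn't exist")
```

With a real bucket, the same function could be given paginator.paginate(...) directly in place of fake_pages.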
Upvotes: 0
Reputation: 1727
The response from get() on an S3.Object or S3.ObjectSummary can contain the following entry:
'ContentType': 'application/x-directory'
if the key is a directory placeholder that was created with that content type.
import boto3

bucket = boto3.resource('s3').Bucket('my-bucket')  # bucket name is a placeholder
for s3_obj_summary in bucket.objects.all():
    if s3_obj_summary.get()['ContentType'] == 'application/x-directory':
        print(str(s3_obj_summary))
Upvotes: 6
Reputation: 269101
You are correct that folders do not exist. You could, for example, create a file called output/2020/01/01/foo.txt
even if none of those sub-folders exist.
Some systems, however, like to 'create' a folder by making a zero-length object with the name of the pretend folder. In such cases, you can identify the 'folder' by checking the length of the object.
Here's some sample code (using boto3 client method):
import boto3

s3 = boto3.client('s3', region_name='ap-southeast-2')
response = s3.list_objects_v2(Bucket='my-bucket')
for obj in response['Contents']:
    if obj['Size'] == 0:
        # Print name of zero-size object
        print(obj['Key'])
Officially, there is no reason for such 'folder files' to exist. Amazon S3 will work perfectly well without them (and often better, for reasons you are discovering!).
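To illustrate that the 'folders' are implied by the key names alone, here is a minimal sketch that derives them from a list of keys without any placeholder objects. implied_folders is a hypothetical helper, not a boto3 function, and the key is the example from above.

```python
def implied_folders(keys):
    """Return the set of 'folder' prefixes implied by a list of keys.

    Hypothetical helper: it only inspects key strings and never calls S3.
    """
    folders = set()
    for key in keys:
        # Drop the last segment (the file name, or '' when the key ends in '/')
        parts = key.split("/")[:-1]
        for depth in range(1, len(parts) + 1):
            folders.add("/".join(parts[:depth]) + "/")
    return folders


for folder in sorted(implied_folders(["output/2020/01/01/foo.txt"])):
    print(folder)
```

A key with no slash implies no folders at all, which matches how S3 itself sees the bucket: a flat collection of keys.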
Upvotes: 5