Reputation: 177
I want to download only the latest file from a folder in an S3 bucket. The folder contains subfolders as well as files, but I only need the most recent file that sits directly inside the folder, which I then want to upload into a single folder. I am adapting code from another Stack Overflow answer.
Here is the structure of the S3 bucket:
S3-Bucket
    --folder_1
        --abc2022.01.29.csv
        --bsv2022.02.18.csv
        --test2022.03.04.csv
        --Folder_12
        --Folder_13
        --folder_14
So basically, I want to download the latest file that sits directly inside folder_1, not one from the subfolders inside it (Folder_12, Folder_13, Folder_14).
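To make "directly inside folder_1" concrete, here is a minimal sketch of the rule in plain Python, with no AWS calls. The helper name `is_direct_child` and the nested key `folder_1/Folder_12/other.csv` are hypothetical and only serve as an illustration.

```python
def is_direct_child(key: str, prefix: str) -> bool:
    """Return True if `key` is a file directly under `prefix` (no deeper folder)."""
    if not prefix.endswith("/"):
        prefix += "/"
    if not key.startswith(prefix):
        return False
    rest = key[len(prefix):]
    # An empty remainder is the folder placeholder itself; a "/" means a subfolder.
    return rest != "" and "/" not in rest


keys = [
    "folder_1/abc2022.01.29.csv",
    "folder_1/Folder_12/other.csv",  # hypothetical nested key
    "folder_1/",                     # folder placeholder key
]
print([k for k in keys if is_direct_child(k, "folder_1")])
# prints ['folder_1/abc2022.01.29.csv']
```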
I am getting the following error:
TypeError: 'NoneType' object is not subscriptable
Here is the code snippet I am using to download the latest file:
    def get_most_recent_s3_object(bucket_name, prefix):
        s3 = session.client('s3')
        paginator = s3.get_paginator("list_objects_v2")
        page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
        latest = None
        for page in page_iterator:
            if "Contents" in page:
                latest2 = max(page['Contents'], key=lambda x: x['LastModified'])
                if latest is None or latest2['LastModified'] > latest['LastModified']:
                    latest = latest2
        with open(latest, 'wb') as f:
            s3.download_fileobj(bucket_name, latest, 'C:\\Users\\xxxx\\')
        return latest

    latest = get_most_recent_s3_object(bucket_name='bucket_name_1', prefix='folder_1')
    print(latest['Key'])
But I'm not able to download the file to my local path. The code picks up the latest file from the nested subfolders instead of from directly inside folder_1.
Upvotes: 2
Views: 1765
Reputation: 177
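I modified the code as shown below to download the latest file directly inside the folder, and it is working fine. The prefix now ends with a slash (folder_1/), so with Delimiter="/" the listing returns only the objects directly under folder_1 in Contents, and the download is written to a local file opened in binary mode. Here is the working code snippet: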
    import boto3

    # Assumption: default AWS credentials and region are already configured
    session = boto3.Session()

    def get_most_recent_s3_object(bucket_name, prefix):
        s3 = session.client('s3')
        paginator = s3.get_paginator("list_objects_v2")
        # Delimiter="/" keeps Contents to objects directly under the prefix
        page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
        latest = None
        for page in page_iterator:
            if "Contents" in page:
                latest2 = max(page['Contents'], key=lambda x: x['LastModified'])
                if latest is None or latest2['LastModified'] > latest['LastModified']:
                    latest = latest2
        if latest is None:
            print('No files found under prefix', prefix)
            return None
        # Download once, after all pages have been compared
        with open('C:\\Users\\xxxx\\dummy.csv', 'wb') as f:
            s3.download_fileobj(bucket_name, latest['Key'], f)
        print('Latest file downloaded successfully....!!!')
        return latest['Key']

    latest = get_most_recent_s3_object(bucket_name='bucket_name_1', prefix='folder_1/')
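The selection step can also be checked without touching AWS. The following is only a sketch, not part of the snippet above: `pick_latest` is a hypothetical helper, the pages are hand-built dicts shaped like `list_objects_v2` responses, and the timestamps are invented for illustration. It also skips keys ending in "/", since a folder created from the S3 console can show up in Contents as a zero-byte placeholder object.

```python
from datetime import datetime, timezone


def pick_latest(pages):
    """Return the newest object dict across all pages, or None if there is none."""
    latest = None
    for page in pages:
        for obj in page.get("Contents", []):
            if obj["Key"].endswith("/"):
                continue  # skip zero-byte "folder" placeholder keys
            if latest is None or obj["LastModified"] > latest["LastModified"]:
                latest = obj
    return latest


# Hypothetical pages shaped like list_objects_v2 responses (timestamps invented).
pages = [
    {"Contents": [
        {"Key": "folder_1/",
         "LastModified": datetime(2022, 3, 5, tzinfo=timezone.utc)},
        {"Key": "folder_1/abc2022.01.29.csv",
         "LastModified": datetime(2022, 1, 29, tzinfo=timezone.utc)},
    ]},
    {},  # a page with no "Contents" key
    {"Contents": [
        {"Key": "folder_1/test2022.03.04.csv",
         "LastModified": datetime(2022, 3, 4, tzinfo=timezone.utc)},
    ]},
]

newest = pick_latest(pages)
print(newest["Key"] if newest else "no files found")
```

Returning None for an empty listing lets the caller check the result before indexing it, which avoids the "'NoneType' object is not subscriptable" error when the prefix matches nothing.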
Upvotes: 1