user 98

Reputation: 177

Using Python, download the latest file from a folder in an S3 bucket, not from its subfolders

I want to download the latest file from a single folder in an S3 bucket. The folder contains files as well as several subfolders, but I need only the file with the latest date that sits directly inside the folder, ignoring the subfolders, and I want to upload it into one folder. I adapted the code from another Stack Overflow answer.

Here is the structure of the S3 bucket:

  S3-Bucket : --folder_1
                  --abc2022.01.29.csv
                  --bsv2022.02.18.csv
                  --test2022.03.04.csv
                  --Folder_12
                  --Folder_13
                  --folder_14

So basically, I want to download the latest file that sits directly inside folder_1, not one from the folders nested inside it (Folder_12, Folder_13, Folder_14).

I am getting the following error:

TypeError: 'NoneType' object is not subscriptable

Here is the code snippet I am using to download the latest file:

  def get_most_recent_s3_object(bucket_name, prefix)

       s3 = session.client('s3')
       paginator = s3.get_paginator( "list_objects_v2" )
       page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
       latest = None
       for page in page_iterator:
           if "Contents" in page:
               latest2 = max(page['Contents'], key=lambda x: x['LastModified'])
               if latest is None or latest2['LastModified'] > latest['LastModified']:
                    latest = latest2
                    with open(latest, 'wb') as f:
                         s3.download_fileobj(bucket_name, latest, 'C:\\Users\xxxx\\)
      return latest
      

  latest = get_most_recent_s3_object(bucket_name='bucket_name_1', prefix='folder_1')
  print(latest['Key'])

But I'm not able to download the file to my local path. The code picks up the latest file from the nested folders instead of from folder_1 itself.

Upvotes: 2

Views: 1765

Answers (1)

user 98

Reputation: 177

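I modified the code to download the latest file directly inside the folder, and it now works. The changes are that the prefix ends with a slash ('folder_1/') and that download_fileobj receives the object key and an open file handle. The working snippet follows.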

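import boto3

def get_most_recent_s3_object(bucket_name, prefix):
    # The original relied on an existing `session`; a default boto3 session is assumed here.
    session = boto3.Session()
    s3 = session.client('s3')
    paginator = s3.get_paginator("list_objects_v2")
    # Delimiter="/" keeps objects inside sub-folders out of "Contents".
    page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
    latest = None
    for page in page_iterator:
        if "Contents" in page:
            latest2 = max(page['Contents'], key=lambda x: x['LastModified'])
            # Keep the whole object so later pages can still compare LastModified.
            if latest is None or latest2['LastModified'] > latest['LastModified']:
                latest = latest2
    if latest is None:
        return None  # nothing was listed under this prefix
    # Download once, after every page has been compared.
    with open(r'C:\Users\xxxx\dummy.csv', 'wb') as f:
        s3.download_fileobj(bucket_name, latest['Key'], f)
    print('Latest file downloaded successfully....!!!')
    return latest['Key']


latest = get_most_recent_s3_object(bucket_name='bucket_name_1', prefix='folder_1/')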
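
To see why the trailing slash in the prefix matters, here is a small sketch that runs locally without AWS. `list_like_s3` is a hypothetical helper that only imitates how a listing with Delimiter="/" splits keys into "Contents" and "CommonPrefixes"; it is not a boto3 call. `pick_latest` is likewise a made-up name for the selection step, and it additionally skips keys ending in "/" on the assumption that the folder may have a zero-byte placeholder object. The nested file name and all timestamps are invented for illustration.

```python
from datetime import datetime, timezone


def list_like_s3(keys, prefix, delimiter="/"):
    """Imitate a delimited listing: returns (contents, common_prefixes)."""
    contents, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Keys with a delimiter after the prefix are rolled up into one entry.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common_prefixes)


def pick_latest(pages):
    """Return the newest object dict across all pages, or None if there is none."""
    latest = None
    for page in pages:
        for obj in page.get("Contents", []):
            if obj["Key"].endswith("/"):
                continue  # assumed folder placeholder, not a real file
            if latest is None or obj["LastModified"] > latest["LastModified"]:
                latest = obj
    return latest


if __name__ == "__main__":
    keys = [
        "folder_1/",
        "folder_1/abc2022.01.29.csv",
        "folder_1/test2022.03.04.csv",
        "folder_1/Folder_12/nested.csv",  # hypothetical nested file
    ]
    # Without the trailing slash everything collapses into one common prefix,
    # so "Contents" is empty and the function would end up with None.
    print(list_like_s3(keys, "folder_1"))
    # With the trailing slash only the direct children are listed as contents.
    print(list_like_s3(keys, "folder_1/"))
```

With an empty "Contents", `latest` stays None, which would explain the TypeError raised by `latest['Key']` in the question. Returning None from `pick_latest` lets the caller check for that case before downloading.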

Upvotes: 1
