AnalyticsPy

Reputation: 245

How to pick up the file name dynamically while uploading a file to S3 with Python?

I am working on a requirement where I have to save the logs of my ETL scripts to an S3 location.

So far I am able to store the logs on my local system, and now I need to upload them to S3.

For this I have written the following code:

import logging
import datetime
import boto3
from boto3.s3.transfer import S3Transfer
from etl import CONFIG

FORMAT = ('%(asctime)s [%(levelname)s] %(filename)s:%(lineno)s '
          '%(funcName)s() : %(message)s')
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
logger = logging.getLogger()
logger.setLevel(logging.INFO)
S3_DOMAIN = 'https://s3-ap-southeast-1.amazonaws.com'
S3_BUCKET = CONFIG['S3_BUCKET']
filepath = ''
folder_name = 'etl_log'
filename = ''


def log_file_conf(merchant_name, table_name):
    # The log file name is generated dynamically per merchant and table.
    log_filename = (datetime.datetime.now().strftime('%Y-%m-%dT%H-%M-%S')
                    + '_' + table_name + '.log')
    fh = logging.FileHandler(
        "E:/test/etl_log/" + merchant_name + "/" + log_filename)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(logging.Formatter(FORMAT, DATETIME_FORMAT))
    logger.addHandler(fh)

client = boto3.client('s3',
                      aws_access_key_id=CONFIG['S3_KEY'],
                      aws_secret_access_key=CONFIG['S3_SECRET'])
transfer = S3Transfer(client)
transfer.upload_file(filepath, S3_BUCKET, folder_name + "/" + filename)

The issue I am facing is that the logs are generated for different merchants, so their names depend on the merchant. I have taken care of this while saving locally.

But for the upload to S3 I don't know how to pick the log file name.

Can anyone please help me to achieve my goal?

Upvotes: 0

Views: 1372

Answers (1)

mootmoot

Reputation: 13176

S3 is an object store; it doesn't have a "real" path. The so-called path, i.e. the "/" separator, is purely cosmetic. So nothing prevents you from mirroring your local file naming convention in the object key, e.g.

transfer.upload_file(filepath, S3_BUCKET, folder_name+"/" + merchant_name + "/" + filename)
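Since the file name is generated inside log_file_conf, the simplest way to know it at upload time is to return it from that function. A minimal sketch, assuming the same CONFIG, logger, transfer, and folder_name as in the question ('merchant_a' and 'orders' are hypothetical example values):

def log_file_conf(merchant_name, table_name):
    # Build the dynamic name once and hand it back to the caller.
    log_filename = (datetime.datetime.now().strftime('%Y-%m-%dT%H-%M-%S')
                    + '_' + table_name + '.log')
    local_path = "E:/test/etl_log/" + merchant_name + "/" + log_filename
    fh = logging.FileHandler(local_path)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(logging.Formatter(FORMAT, DATETIME_FORMAT))
    logger.addHandler(fh)
    return local_path, log_filename

# Hypothetical usage: the returned names feed straight into the upload.
local_path, log_filename = log_file_conf('merchant_a', 'orders')
transfer.upload_file(local_path, S3_BUCKET,
                     folder_name + "/" + "merchant_a" + "/" + log_filename)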

To list all the files under an arbitrary path (in S3 it is called a "prefix"), you just do this:

# simple list_objects call, not handling pagination; max 1000 objects listed
client.list_objects(
    Bucket=S3_BUCKET,
    Prefix=folder_name + "/" + merchant_name
)
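If a merchant accumulates more than 1000 log files, boto3's built-in paginator handles the continuation tokens for you. A short sketch using the standard list_objects_v2 paginator (same client and variables as above):

# Iterate every key under the prefix, without the 1000-object cap.
paginator = client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=S3_BUCKET,
                           Prefix=folder_name + "/" + merchant_name)
for page in pages:
    for obj in page.get('Contents', []):  # 'Contents' is absent on empty pages
        print(obj['Key'])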

Upvotes: 1
