koko

Reputation: 199

What is the best way to use an AWS scheduler, considering cost and performance?

I'm working on a Java project that uploads files to an AWS S3 bucket. I now need to process those files (validate them and send the data to a database) every day at 8:00 a.m., and I'm planning to use an AWS scheduler for this. However, I'm confused about which scheduler to use and how to use it. I went through the documentation and found AWS Batch and CloudWatch scheduled events that trigger a Lambda, but I don't know which option is best in this scenario, and I'm not sure whether AWS Batch is suitable. Cost matters as well. I'd appreciate a suggestion for the best way to solve this. Alternative approaches are also welcome.

P.S.: Processing the files will take more than 15 minutes. I also need to configure several other schedulers.

Upvotes: 0

Views: 902

Answers (2)

Nghia Do

Reputation: 2658

My proposed solution is:

  1. Use a CloudWatch rule to trigger a Lambda at 8 a.m. (for example, SchedulerLambda).
  2. SchedulerLambda does NOT process any files; it only lists the files in the defined location.
  3. For each file, SchedulerLambda sends an SNS message to a topic.
  4. The SNS topic has an SQS subscription.
  5. The SQS queue has a Lambda trigger (for example, FileProcessorLambda).
  6. FileProcessorLambda processes messages in batches (max is 10). You can adjust the batch size depending on your use case.
  7. After FileProcessorLambda has finished a file, it records the status in DynamoDB, so that processing can be retried and resumed at any time.
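
The control flow of steps 2, 3, 6 and 7 can be sketched in plain Java. This is only an illustration under stated assumptions: it makes no AWS SDK calls, the class and method names (`FanOutSketch`, `schedule`, `toBatches`, `processBatch`) are hypothetical, a `Map` stands in for the DynamoDB status table, and a `Consumer<String>` stands in for the SNS publish call. Skipping files already marked as done is one possible reading of "retry and resume", not something the answer spells out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory sketch of the fan-out flow; no AWS services are called.
class FanOutSketch {
    static final String DONE = "DONE";

    // Steps 2-3: publish one message per listed file, skipping files already marked DONE
    // (assumption: the status table is consulted so that a rerun resumes where it stopped).
    static List<String> schedule(List<String> keys, Map<String, String> statusTable,
                                 Consumer<String> publisher) {
        List<String> published = new ArrayList<>();
        for (String key : keys) {
            if (DONE.equals(statusTable.get(key))) {
                continue;
            }
            publisher.accept(key);
            published.add(key);
        }
        return published;
    }

    // Step 6: group queued messages into batches of at most batchSize.
    static List<List<String>> toBatches(List<String> messages, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < messages.size(); i += batchSize) {
            int end = Math.min(i + batchSize, messages.size());
            batches.add(new ArrayList<>(messages.subList(i, end)));
        }
        return batches;
    }

    // Step 7: process each file in a batch, then record its status.
    static void processBatch(List<String> batch, Map<String, String> statusTable,
                             Consumer<String> processor) {
        for (String key : batch) {
            processor.accept(key);
            statusTable.put(key, DONE);
        }
    }

    public static void main(String[] args) {
        Map<String, String> statusTable = new HashMap<>();
        statusTable.put("a.csv", DONE); // finished in an earlier run
        List<String> queue = new ArrayList<>();
        schedule(Arrays.asList("a.csv", "b.csv", "c.csv"), statusTable, queue::add);
        for (List<String> batch : toBatches(queue, 10)) {
            processBatch(batch, statusTable, key -> System.out.println("processing " + key));
        }
    }
}
```

In a real deployment the publisher would be an SNS publish call, the batches would be delivered by the SQS trigger, and the status map would be a DynamoDB table.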

Note: This design prioritizes cost, scaling, maintenance, and loose coupling.

Note: This assumes that processing a single file takes no more than 15 minutes, which is the Lambda execution limit. If processing one file takes longer than 15 minutes, the solution above won't work. I can suggest another solution if you confirm.

Upvotes: 3

raupach

Reputation: 3102

One way (there are always many with AWS) is through EventBridge (formerly CloudWatch Events) and AWS Lambda. I haven't worked with AWS Batch before.

Code and deploy your AWS Lambda function. In your Lambda you access the S3 bucket, validate, and send the data to the database.

Open the AWS Console and go to your Lambda function. Next, choose Add trigger and select EventBridge.

Now you can create a new rule. To make it run every day at 8 a.m., set the schedule expression to cron(0 8 * * ? *)

Some things to keep in mind:

  • Don't forget that a Lambda can never run longer than 15 minutes.
  • Schedule expressions are in UTC, not local time, so DST is an issue.
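
To illustrate the UTC/DST point, here is a minimal sketch using only `java.time`. The class and method names (`CronUtcSketch`, `cronForLocalTime`) and the `Europe/Berlin` zone are assumptions chosen for the example. It shows that "8 a.m. local time" maps to a different UTC hour in winter and in summer, so a single fixed cron expression drifts by one hour when DST changes.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper: derives the UTC-based daily cron expression for a local wall-clock time.
class CronUtcSketch {
    // Converts the local time on the given date to UTC and formats a daily schedule expression.
    static String cronForLocalTime(LocalDate date, LocalTime localTime, ZoneId zone) {
        ZonedDateTime utc = ZonedDateTime.of(date, localTime, zone)
                .withZoneSameInstant(ZoneOffset.UTC);
        return String.format("cron(%d %d * * ? *)", utc.getMinute(), utc.getHour());
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Europe/Berlin"); // assumed zone, for illustration only
        LocalTime eight = LocalTime.of(8, 0);
        // Winter (UTC+1): prints cron(0 7 * * ? *)
        System.out.println(cronForLocalTime(LocalDate.of(2024, 1, 15), eight, zone));
        // Summer (UTC+2): prints cron(0 6 * * ? *)
        System.out.println(cronForLocalTime(LocalDate.of(2024, 7, 15), eight, zone));
    }
}
```

If the job must follow local time, the expression has to be updated at each DST switch, or the function has to check the local time itself.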

Upvotes: 1
