Reputation: 83
I'm designing the architecture of my ETL for an ML-related project.
When automating tasks in AWS with Lambdas, I generally use EventBridge, SQS, or S3 as triggers. In my case Lambda doesn't fit my needs, so I decided to go with Amazon SageMaker Processing Jobs. These jobs scale by instance type and number of instances, and they can run for up to 24 hours; these are the requirements that Lambda couldn't meet.
The architecture that I generally use:
But as you can imagine, that Lambda layer is only used for launching SageMaker Processing Jobs, so I would like to avoid it.
Questions:
Q1. Is there a better architecture using AWS services to automate/trigger/schedule SageMaker Processing Jobs?
Q2. Which services would be better suited to this task?
Upvotes: 0
Views: 3007
Reputation: 1152
You can use SageMaker Pipelines to run a SageMaker Processing Job as a pipeline step. Pipelines has native integration with EventBridge, so you can, for example, trigger the pipeline when an object is uploaded to a specific S3 bucket. See the integration here - https://docs.aws.amazon.com/sagemaker/latest/dg/pipeline-eventbridge.html
Here are a few samples to get started with pipelines - https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-pipelines/index.html. You only pay for the jobs executed in the pipeline, not for the orchestration itself.
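As a rough sketch of the S3-upload trigger described above, the EventBridge rule and its pipeline target could be set up with boto3 along these lines. The helper names, the rule name, the target Id, and all ARNs are placeholders I made up for illustration; it also assumes the pipeline (containing your processing step) already exists and that the bucket has EventBridge notifications turned on.

```python
import json


def s3_upload_event_pattern(bucket, prefix=""):
    """Build an EventBridge pattern matching S3 'Object Created' events for one bucket."""
    detail = {"bucket": {"name": [bucket]}}
    if prefix:
        # Optional: only match object keys that start with this prefix.
        detail["object"] = {"key": [{"prefix": prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


def pipeline_target(pipeline_arn, role_arn, parameters=None):
    """Build the EventBridge target entry that starts a SageMaker pipeline execution."""
    target = {
        "Id": "start-processing-pipeline",  # hypothetical target id
        "Arn": pipeline_arn,
        "RoleArn": role_arn,
    }
    if parameters:
        # Pipeline parameters are passed as a list of Name/Value string pairs.
        target["SageMakerPipelineParameters"] = {
            "PipelineParameterList": [
                {"Name": name, "Value": value} for name, value in parameters.items()
            ]
        }
    return target


def create_trigger(rule_name, bucket, prefix, pipeline_arn, role_arn, parameters=None):
    """Create the rule and attach the pipeline as its target (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_upload_event_pattern(bucket, prefix)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[pipeline_target(pipeline_arn, role_arn, parameters)],
    )
```

Calling `create_trigger(...)` with your own bucket, pipeline ARN, and an IAM role that EventBridge can assume (with permission to start the pipeline execution) should replace the Lambda layer entirely. For a time-based schedule instead of an S3 trigger, `put_rule` accepts a `ScheduleExpression` in place of `EventPattern`.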
Upvotes: 2