Vale Fox

Reputation: 3

How to schedule a retrain of a SageMaker pipeline model using Airflow

I have already implemented a SageMaker pipeline model. For an end-to-end notebook that trains a model, builds a pipeline model, and deploys it, I followed this sample notebook.

Now I would like to retrain and deploy the entire pipeline every day using Airflow, but the example I have seen here only shows how to retrain and deploy a single SageMaker model.

Is there a way to retrain and deploy the entire pipeline? Thanks

Upvotes: 0

Views: 1676

Answers (1)

SphericalCow

Reputation: 176

SageMaker provides two options for working with Airflow:

  1. Use the APIs in the SageMaker Python SDK to generate the input for the SageMaker operators in Airflow. The blog you linked takes this approach: for example, it uses the training_config API from the SageMaker Python SDK together with the SageMakerTrainingOperator in Airflow.

  2. Use the PythonOperator provided by Airflow and write Python code to do what you want.

For option 1, SageMaker has only implemented APIs for training, tuning, single-model deployment, and transform. Since you are working with a pipeline model, I don't think it has the API you want.

For option 2, if you can do what you want in Python code with SageMaker, you should be able to turn that code into Python callables and run them with PythonOperators. Here's an example of training this way, provided by SageMaker:

https://sagemaker.readthedocs.io/en/stable/using_workflow.html#using-airflow-python-operator

I think you can do something similar to make Airflow work with your pipeline model.
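As a rough sketch of option 2 (not something taken from the SageMaker docs), a daily DAG could wrap the notebook code in one Python callable. The names train_models, deploy_pipeline_model, retrain_and_deploy, build_dag, and the dag_id are hypothetical placeholders; the two placeholder functions are where your own notebook code would go. It assumes Airflow 2.4 or later, where DAG accepts the schedule argument.

```python
from datetime import datetime


def train_models():
    """Hypothetical placeholder for the training code from your notebook.

    Fit each estimator here and return the S3 URIs of the model artifacts.
    """
    raise NotImplementedError("paste your training code here")


def deploy_pipeline_model(model_uris):
    """Hypothetical placeholder for the deployment code from your notebook.

    Rebuild the pipeline model from the artifacts, deploy it, and return
    the endpoint name.
    """
    raise NotImplementedError("paste your deployment code here")


def retrain_and_deploy(train_fn=train_models, deploy_fn=deploy_pipeline_model):
    """Python callable for Airflow: one full retrain-and-deploy cycle."""
    model_uris = train_fn()
    return deploy_fn(model_uris)


def build_dag():
    """Create the daily DAG.

    Airflow is imported inside the function so that the callables above
    can be exercised without an Airflow installation.
    """
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="retrain_sagemaker_pipeline_model",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # assumes Airflow 2.4+; older 2.x uses schedule_interval
        catchup=False,  # do not backfill runs for past dates
    ) as dag:
        PythonOperator(
            task_id="retrain_and_deploy",
            python_callable=retrain_and_deploy,
        )
    return dag
```

In an actual DAG file you would add dag = build_dag() at module level so that the scheduler can discover it. Splitting training and deployment into two PythonOperator tasks is also possible, as long as whatever is passed between them is small enough to go through XCom.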

Upvotes: 1
