Reputation: 163
I am building a time series use case to automate the preprocessing and retraining tasks. First the data is preprocessed using numpy, pandas, statsmodels, etc., and then a machine learning algorithm is applied to make predictions. The reason for using an inference pipeline is that it reuses the same preprocessing code for training and inference. I have checked the examples given by the AWS SageMaker team for Spark and scikit-learn. In both examples they use a scikit-learn container to fit and transform their preprocessing code. Do I also have to create a container, which seems unnecessary in my use case since I am not using any scikit-learn code?
Can someone give me a custom example of using these pipelines? Any help is appreciated!
Sources looked into:
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/scikit_learn_inference_pipeline
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/inference_pipeline_sparkml_blazingtext_dbpedia
Upvotes: 0
Views: 1764
Reputation: 384
Apologies for the late response.
Below is some documentation on inference pipelines:
https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html
https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
Do I also have to create a container, which seems unnecessary in my use case since I am not using any scikit-learn code?
Your container is an encapsulation of the environment your custom code needs in order to run properly. Based on the requirements you listed, "numpy, pandas, statsmodels, etc., and then a machine learning algorithm", I would either create a container if you wish to isolate your dependencies, or modify an existing prebuilt SageMaker container, such as the scikit-learn one, and add your dependencies to it.
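For the second option, here is a minimal sketch of what reusing the prebuilt scikit-learn container could look like with the SageMaker Python SDK. The script name, source directory, S3 path, and instance type are placeholders, and the parameter names follow the v1 SDK used in the notebooks you linked; the framework container will install anything listed in a requirements.txt shipped alongside your script, so pandas, statsmodels, etc. can be pulled in without building a new image:

```python
# Sketch: run custom preprocessing code (numpy/pandas/statsmodels) on the
# prebuilt SageMaker scikit-learn container instead of building a new image.
# "preprocess.py", "source_dir/" and the S3 path are placeholder names.
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

role = sagemaker.get_execution_role()

sklearn_preprocessor = SKLearn(
    entry_point="preprocess.py",      # your fit/transform logic
    source_dir="source_dir",          # contains preprocess.py and requirements.txt
    framework_version="0.20.0",
    role=role,
    train_instance_type="ml.c4.xlarge",
)

# "Training" here just fits whatever state your preprocessing needs
# (e.g. scaling parameters, lag features) and saves it as a model artifact,
# so the same code can be reused at inference time.
sklearn_preprocessor.fit({"train": "s3://your-bucket/path/to/raw-data"})
```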
Can someone give me a custom example of using these pipelines? Any help is appreciated!
Unfortunately, the two example notebooks referenced above are the only examples that use inference pipelines. The biggest hurdle is most likely creating containers that handle the preprocessing and prediction tasks you need, and then combining the two into the inference pipeline.
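To give a rough idea of what that combination looks like in code, below is a minimal sketch following the pattern of the scikit-learn notebook you linked. It assumes the preprocessor above and some estimator for your ML algorithm (here called `ts_estimator`, a placeholder) have both already been fit; the pipeline name, endpoint name, and instance type are also placeholders:

```python
# Sketch: chain a fitted preprocessing model and a fitted prediction model
# into a single real-time inference pipeline endpoint.
from sagemaker.pipeline import PipelineModel

preprocess_model = sklearn_preprocessor.create_model()
prediction_model = ts_estimator.create_model()

pipeline_model = PipelineModel(
    name="timeseries-inference-pipeline",          # placeholder name
    role=role,
    models=[preprocess_model, prediction_model],   # executed in this order
)

# One endpoint: each request is preprocessed, then passed to the predictor.
predictor = pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
    endpoint_name="timeseries-inference-pipeline",  # placeholder name
)
```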
Upvotes: 1