Reputation: 41
I have been looking around at various posts about deploying SageMaker models locally, but they all have to be tied to an AWS notebook instance in order to run predict/serve locally (AWS SageMaker Python SDK). This defeats the actual intent of running the SageMaker-trained model fully offline. Others have tried unpickling the tar.gz file on S3 and wrapping the contents to be deployed locally, but that process seems to be restricted to certain types of models such as XGBoost and MXNet. Hence, is there any way to deploy a SageMaker-trained model offline without depending on a SageMaker notebook instance? Any advice would be appreciated. Thank you.
Upvotes: 4
Views: 6367
Reputation: 33
I've deployed PyTorch models locally via Amazon SageMaker Local Mode. I believe the same process works for other ML frameworks that have official SageMaker containers. You can run the same Docker containers locally that SageMaker uses when deploying your model on AWS infrastructure.
The docs for deploying a SageMaker endpoint locally for inference are a bit scattered. In summary:
- The SageMaker Python SDK normally uses the botocore.client.SageMaker and botocore.client.SageMakerRuntime classes to talk to SageMaker from Python. To use SageMaker locally, use sagemaker.local.LocalSagemakerClient() and sagemaker.local.LocalSagemakerRuntimeClient() instead.
- Use a local tar.gz model file if you wish.
- Set instance_type to local when deploying the model.

I wrote How to setup a local AWS SageMaker environment for PyTorch, which goes into detail on how this works.
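For concreteness, here is a minimal sketch of those steps using the PyTorch container in Local Mode. The file names (model.tar.gz, inference.py), the dummy IAM role ARN, and the framework/Python versions are placeholders for your own values, and Docker must be installed for the local container to run:

```python
# Minimal sketch: deploying a SageMaker-trained PyTorch model locally with
# SageMaker Local Mode. Requires Docker and the sagemaker Python SDK.
import numpy as np
from sagemaker.local import LocalSession
from sagemaker.pytorch import PyTorchModel

session = LocalSession()
session.config = {'local': {'local_code': True}}  # keep everything on the local machine

model = PyTorchModel(
    model_data='file://./model.tar.gz',                 # local path instead of an s3:// URI
    role='arn:aws:iam::111111111111:role/dummy-role',   # placeholder; no real AWS calls are made
    entry_point='inference.py',                          # your model_fn/predict_fn script
    framework_version='1.13',                            # example versions; pick ones your SDK supports
    py_version='py39',
    sagemaker_session=session,
)

# instance_type='local' tells the SDK to start the inference container with Docker
predictor = model.deploy(initial_instance_count=1, instance_type='local')

payload = np.random.rand(1, 3, 224, 224).astype('float32')  # input shape depends on your model
result = predictor.predict(payload)

predictor.delete_endpoint()  # stops the local container
```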
Upvotes: 1
Reputation: 5578
Once you have trained a model using Amazon SageMaker, you'll have a Model entry. The model points to a model artifact in S3; this tar.gz file holds the model weights. The format of the file depends on the framework (TensorFlow/PyTorch/MXNet/...) you used to train the model. If you used SageMaker built-in algorithms, most of them are implemented with MXNet or XGBoost, so you could use the relevant model-serving software to run the model.
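If you only want the artifact itself, with no SageMaker SDK involved, you can fetch and unpack it with plain boto3. The bucket name and key below are placeholders for your training job's output location:

```python
# Sketch: download the training job's model artifact from S3 and unpack it.
# Replace the bucket and key with the values from your Model entry / training job.
import tarfile
import boto3

s3 = boto3.client('s3')
s3.download_file('my-sagemaker-bucket',
                 'output/my-training-job/output/model.tar.gz',
                 'model.tar.gz')

with tarfile.open('model.tar.gz', 'r:gz') as tar:
    tar.extractall(path='model')  # contents depend on the training framework
```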
If you need serving software, you could run the SageMaker deep learning containers in inference mode on your local inference server, use open-source serving software such as TensorFlow Serving, or load the model in memory.
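As an illustration of the load-in-memory route: for the built-in XGBoost algorithm (the older, non-framework versions), the extracted archive typically contains a pickled Booster named xgboost-model, which you can use directly with no SageMaker dependency. The file name and format here are assumptions; check what your extracted archive actually contains.

```python
# Sketch: run the extracted model fully offline, assuming it came from the
# built-in XGBoost algorithm, which pickles a Booster as 'xgboost-model'.
import pickle
import numpy as np
import xgboost as xgb

with open('model/xgboost-model', 'rb') as f:
    booster = pickle.load(f)

features = np.array([[1.0, 2.0, 3.0]])        # replace with a row of your real features
print(booster.predict(xgb.DMatrix(features)))
```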
Upvotes: 0