Reputation: 1886
I have my model hosted on ACI compute. I'm trying to find out what it would take to support auto-scaling of the underlying instances. If auto-scaling isn't possible, is there documentation on manually scaling the endpoint?
Basically, I need to support high availability on this model endpoint.
One thought I had was to manually publish the model to two endpoints and then put a load balancer in front. That seems a little hacky...
Thanks!
Upvotes: 2
Views: 526
Reputation: 344
We usually recommend deploying to AKS (Azure Kubernetes Service) rather than ACI for high availability. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service
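If you do move to AKS, it helps to have a rough idea of how many replicas you need before setting autoscale bounds. Below is a minimal sketch of the sizing rule of thumb as I recall it from the AKS deployment docs (concurrent requests = target RPS × request time / target utilization, then divide by the requests one container can handle at once). The function name `estimate_replicas`, its parameter names, and the 0.7 default utilization are my own choices for illustration, not part of any Azure SDK, so treat the numbers as a starting point rather than a guarantee.

```python
import math


def estimate_replicas(target_rps, request_time_s,
                      max_requests_per_container=1,
                      target_utilization=0.7):
    """Rough replica estimate for a scoring endpoint (hypothetical helper).

    target_rps: peak requests per second you expect.
    request_time_s: average time to serve one request, in seconds.
    max_requests_per_container: concurrent requests one replica can handle.
    target_utilization: fraction of capacity to aim for (assumed 0.7 here).
    """
    if target_rps < 0 or request_time_s < 0:
        raise ValueError("target_rps and request_time_s must be non-negative")
    if max_requests_per_container < 1:
        raise ValueError("max_requests_per_container must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Requests in flight at peak, padded so replicas run below full load.
    concurrent_requests = target_rps * request_time_s / target_utilization

    # Always keep at least one replica, even with no expected traffic.
    return max(1, math.ceil(concurrent_requests / max_requests_per_container))


if __name__ == "__main__":
    # Example: 10 requests/s at 100 ms each, one request per container.
    print(estimate_replicas(10, 0.1))
```

A figure like this could then inform the upper replica bound in your AKS deployment configuration; in the v1 Python SDK I believe those settings are the `autoscale_*` arguments of `AksWebservice.deploy_configuration`, but check the linked page for the version you use. For high availability you would probably also want the lower bound to be more than one replica.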
Upvotes: 1