Reputation: 7576
We're experimenting with MLOps on Azure Machine Learning, and as such, want to manage an Online Endpoint for inference. However, we also want to save costs, and since the software is only running in a single location for now, we know for a fact people won't be using it outside normal business hours.
The endpoint is deployed to a managed compute instance, which is billed hourly (not per request) for as long as the deployment (the endpoint) is live.
I haven't found any option (in either the UI or the documentation) to schedule the deletion of a deployment automatically. I can configure autoscaling, but I'm not sure that scaling the endpoint to 0% on nights and weekends also releases the compute (my guess is that it doesn't, and we'd still be paying for it). I can delete the deployment by hand every night and redeploy it every morning, but I'd expect to be able to do this automatically, as doing it manually would become unmanageable over time with more endpoints.
Can I automatically reduce the cost of an Azure Machine Learning Online Endpoint to 0 USD outside business hours, based on a schedule? If yes, how?
Upvotes: 2
Views: 1562
Reputation: 14983
The recommended way to delete an Azure Machine Learning Online Endpoint with the latest Azure SDK v2 is the following (SDK v1 is going to be deprecated soon):
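import logging
from azure.ai.ml import MLClient
from azure.identity import AzureCliCredential

# subscription_id, resource_group_name, workspace_name and endpoint_name
# are assumed to be defined beforehand.
ml_client = MLClient(
    credential=AzureCliCredential(),
    subscription_id=subscription_id,
    resource_group_name=resource_group_name,
    workspace_name=workspace_name,
)
# begin_delete returns a poller; .result() blocks until the deletion completes.
ml_client.online_endpoints.begin_delete(name=endpoint_name).result()
logging.info(f"Endpoint '{endpoint_name}' deleted successfully.")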
One way to schedule this is to create a pipeline with SDK v2 whose job is to delete one or more specific endpoints by name.
Adapted to your use case, this means having a pipeline delete the endpoint and recreate it at given times of the day.
You can have a look at scheduling pipelines (in your case, a pipeline with a single job that either deletes or creates the endpoint): https://learn.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipeline-job?view=azureml-api-2&tabs=python
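As a rough sketch of what that single scheduled job could decide on each run: the helper below only works out whether the endpoint should be created, deleted, or left alone. The business hours (Monday to Friday, 08:00 to 18:00) and the names `is_business_hours` and `desired_action` are assumptions for illustration, not part of any Azure SDK; the actual create and delete calls would be made by the job based on the returned value.

```python
from datetime import datetime, time

# Assumed business hours (hypothetical values): Monday-Friday, 08:00-18:00 local time.
OPEN = time(8, 0)
CLOSE = time(18, 0)


def is_business_hours(now: datetime) -> bool:
    """Return True on weekdays between OPEN (inclusive) and CLOSE (exclusive)."""
    return now.weekday() < 5 and OPEN <= now.time() < CLOSE


def desired_action(now: datetime, endpoint_exists: bool) -> str:
    """Decide what the scheduled job should do: 'create', 'delete' or 'none'."""
    in_hours = is_business_hours(now)
    if in_hours and not endpoint_exists:
        return "create"
    if not in_hours and endpoint_exists:
        return "delete"
    return "none"
```

Making the job idempotent like this means it can run on a simple recurring schedule (for example hourly) without failing when the endpoint is already in the desired state.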
Upvotes: 0
Reputation: 90
You can delete an endpoint using the Azure ML Python SDK (v1, azureml-core):
from azureml.core import Workspace, Webservice
# Load the workspace from a local config.json (assumed to exist).
ws = Workspace.from_config()
service = Webservice(workspace=ws, name='your-service-name')
service.delete()
Then, if you want to recreate it, you can redeploy the model:
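from azureml.core import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

# ws is assumed to be an existing Workspace object.
# Placeholder names: replace with your registered model and environment.
model = Model(ws, name='your-model-name')
environment = Environment.get(ws, name='your-environment-name')

service_name = 'my-custom-env-service'
inference_config = InferenceConfig(entry_script='score.py', environment=environment)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)
# Block until the deployment finishes, printing progress as it goes.
service.wait_for_deployment(show_output=True)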
There is currently no way to schedule or temporarily disable the endpoint. The only way is to delete and recreate it using the Azure ML SDK. Another option is to deploy the ML model in an Azure Function App, in which case you only pay for the requests made.
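To get a feel for what deleting and recreating on a schedule could save, here is a back-of-the-envelope sketch. It assumes billing is purely proportional to the hours the deployment is live and ignores the time spent redeploying; the 50 live hours per week (5 days of 10 hours) and the function names are made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def weekly_cost(hourly_rate: float, live_hours: float) -> float:
    """Cost for one week, given an hourly rate and the hours the deployment is live."""
    return hourly_rate * live_hours


def savings_fraction(live_hours: float) -> float:
    """Fraction of the always-on weekly cost saved by being live only live_hours."""
    return 1 - live_hours / HOURS_PER_WEEK


# Hypothetical schedule: live 10 hours a day, Monday to Friday.
business_hours = 5 * 10
print(f"Saved: {savings_fraction(business_hours):.0%} of the always-on cost")
```

With those assumed hours, the endpoint is live for 50 of 168 hours a week, so roughly 70% of the always-on compute cost would be avoided.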
Upvotes: 1
Reputation: 9
I suppose you are using an 'Online' endpoint for the deployment. If so, one possibility is to deploy to a 'Batch' endpoint instead, which allows the compute to scale down to zero.
You can see more about it at: https://learn.microsoft.com/en-us/azure/machine-learning/concept-endpoints?view=Azureml-api-2#deplayments.
Upvotes: -1