Ric S

Reputation: 9267

Query a specific model deployment in an endpoint inside Azure Machine Learning

I'm working in Python. I have already created two model deployments in an Azure Machine Learning real-time endpoint, with a 50-50 traffic split.

Using the Python SDK, I know that to call a specific model deployment I have to specify the deployment_name parameter in the invoke method (tutorial and docs).

However, I want to do the same through the REST API, and I could not find any parameter to pass in the endpoint's query string. As a result, the model deployment is chosen randomly according to the traffic split.

Code example:

import requests
import json

endpoint = "https://my-endpoint.westeurope.inference.ml.azure.com/score"
api_key = "xyz"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

data = {
    "input_data": {
        "columns": [
            "a",
            "b"
            ],
        "index": [0, 1],
        "data": [
            [0.00, 185.0],
            [18.00, 181.0]
        ]
    }
}

response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
print(response.json())

Since the Python SDK has the invoke method, I expected there to be a parameter that could be added to the endpoint query string, e.g. endpoint = https://my-endpoint.westeurope.inference.ml.azure.com/score?deployment=model1 (with deployment=model1 appended at the end), but I did not find anything in the docs.

Any suggestions? Does anyone know if this option is available?

Upvotes: 0

Views: 127

Answers (1)

JayashankarGS

Reputation: 8160

According to this documentation, you need to send the header azureml-model-deployment with the deployment name as its value.

Alter your code as shown below.

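import requests
import json

# endpoint and api_key are the same values as in the question
headers = {
    "Content-Type": "application/json",
    # Routes the request to the named deployment instead of using the traffic split
    "azureml-model-deployment": "<your-deployment-name>",
    "Authorization": f"Bearer {api_key}"
}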

data = {
    "input_data": {
        "columns": [
            "a",
            "b"
            ],
        "index": [0, 1],
        "data": [
            [0.00, 185.0],
            [18.00, 181.0]
        ]
    }
}

response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
print(response.json())

Output for different deployments:

Deployment blue

[screenshot of the response]

Deployment red

[screenshot of the response]

The request for deployment red returns a deployment-not-found error, because I have not created a deployment named red.
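If you switch between deployments often, the header logic can be wrapped in a small helper. This is only a sketch: the names build_scoring_headers and score are hypothetical (they are not part of any Azure SDK), and the only Azure-specific piece is the azureml-model-deployment header described above. When no deployment name is passed, the header is omitted and the endpoint falls back to its traffic split.

```python
import json


def build_scoring_headers(api_key, deployment_name=None):
    """Build request headers for a scoring call (hypothetical helper).

    If deployment_name is given, the azureml-model-deployment header is
    added so the request targets that deployment. Otherwise the header is
    left out and the endpoint's traffic split decides.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        headers["azureml-model-deployment"] = deployment_name
    return headers


def score(endpoint, api_key, payload, deployment_name=None, timeout=30):
    """POST payload to the scoring URL and return the parsed JSON response."""
    # Third-party dependency, imported here so build_scoring_headers
    # can be used and tested without requests installed.
    import requests

    response = requests.post(
        endpoint,
        headers=build_scoring_headers(api_key, deployment_name),
        data=json.dumps(payload).encode("utf-8"),
        timeout=timeout,
    )
    # Raises requests.HTTPError on 4xx/5xx responses, which would include
    # a request for a deployment that does not exist.
    response.raise_for_status()
    return response.json()
```

With the endpoint, api_key and data variables from the question, score(endpoint, api_key, data, deployment_name="blue") would target the blue deployment, while score(endpoint, api_key, data) would leave the choice to the traffic split.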

Upvotes: 1
