edward-loc

Reputation: 15

Azure ML Inference Schema - "List index out of range" error

I have an ML model deployed on Azure ML Studio and I was updating it with an inference schema to allow compatibility with Power BI as described here.

When sending data to the model via the REST API (before adding this inference schema), everything works fine and I get results returned. However, after adding the schema as described in the instructions linked above and personalising it for my data, the same data sent via the REST API only returns the error "list index out of range". The deployment goes ahead fine and is designated as "healthy" with no error messages.

Any help would be greatly appreciated. Thanks.

EDIT:

Entry script:

 import numpy as np
 import pandas as pd
 import joblib
 from azureml.core.model import Model
    
 from inference_schema.schema_decorators import input_schema, output_schema
 from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
 from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
 from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType
    
 def init():
     global model
     #Model name is the name of the model registered under the workspace
     model_path = Model.get_model_path(model_name = 'databricksmodelpowerbi2')
     model = joblib.load(model_path)
    
 #Provide 3 sample inputs for schema generation for 2 rows of data
 numpy_sample_input = NumpyParameterType(np.array([[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0], [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]], dtype = 'float64'))
 pandas_sample_input = PandasParameterType(pd.DataFrame({'1': [2400.0, 368.55], '2': [78.26086956521739, 96.88311688311687], '3': [11100.0, 709681.1600000012], '4': [3.612565445026178, 73.88059701492537], '5': [3.0, 44.0], '6': [0.0, 0.0]}))
 standard_sample_input = StandardPythonParameterType(0.0)
    
 # This is a nested input sample, any item wrapped by `ParameterType` will be described by schema
 sample_input = StandardPythonParameterType({'input1': numpy_sample_input, 
                                             'input2': pandas_sample_input, 
                                             'input3': standard_sample_input})
    
 sample_global_parameters = StandardPythonParameterType(1.0) #this is optional
 sample_output = StandardPythonParameterType([1.0, 1.0])
    
 @input_schema('inputs', sample_input)
 @input_schema('global_parameters', sample_global_parameters) #this is optional
 @output_schema(sample_output)
    
 def run(inputs, global_parameters):
     try:
         data = inputs['input1']
         # data will be converted to the target format
         assert isinstance(data, np.ndarray)
         result = model.predict(data)
         return result.tolist()
     except Exception as e:
         error = str(e)
         return error

Prediction script:

 import requests
 import json
 from ast import literal_eval
    
 # URL for the web service
 scoring_uri = ''
 ## If the service is authenticated, set the key or token
 #key = '<your key or token>'
    
 # Two sets of data to score, so we get two results back
 data = {"data": [[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0], [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]]}
 # Convert to JSON string
 input_data = json.dumps(data)
    
 # Set the content type
 headers = {'Content-Type': 'application/json'}
 ## If authentication is enabled, set the authorization header
 #headers['Authorization'] = f'Bearer {key}'
    
 # Make the request and display the response
 resp = requests.post(scoring_uri, input_data, headers=headers)
 print(resp.text)
    
 result = literal_eval(resp.text)

Upvotes: 1

Views: 958

Answers (2)

Hernán Quiroz

Reputation: 36

The Microsoft documentation says: "In order to generate conforming swagger for automated web service consumption, scoring script run() function must have API shape of:

A first parameter of type "StandardPythonParameterType", named Inputs and nested.

An optional second parameter of type "StandardPythonParameterType", named GlobalParameters.

Return a dictionary of type "StandardPythonParameterType" named Results and nested."

I've already tested this and it is case-sensitive, so it will be like this:

import numpy as np
import pandas as pd
import joblib

from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

def init():
    global model
    # Model name is the name of the model registered under the workspace
    model_path = Model.get_model_path(model_name = 'databricksmodelpowerbi2')
    model = joblib.load(model_path)

# Provide 3 sample inputs for schema generation for 2 rows of data
numpy_sample_input = NumpyParameterType(np.array([[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0],
                                                  [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]],
                                                 dtype='float64'))

pandas_sample_input = PandasParameterType(pd.DataFrame({'value': [2400.0, 368.55],
                                                        'delayed_percent': [78.26086956521739, 96.88311688311687],
                                                        'total_value_delayed': [11100.0, 709681.1600000012],
                                                        'num_invoices_per30_dealing_days': [3.612565445026178, 73.88059701492537],
                                                        'delayed_streak': [3.0, 44.0],
                                                        'prompt_streak': [0.0, 0.0]}))

standard_sample_input = StandardPythonParameterType(0.0)

# This is a nested input sample; any item wrapped by `ParameterType` will be described by schema
sample_input = StandardPythonParameterType({'input1': numpy_sample_input, 
                                         'input2': pandas_sample_input, 
                                         'input3': standard_sample_input})

sample_global_parameters = StandardPythonParameterType(1.0) #this is optional

numpy_sample_output = NumpyParameterType(np.array([1.0, 2.0]))

# 'Results' is case sensitive
sample_output = StandardPythonParameterType({'Results': numpy_sample_output})

# 'Inputs' is case sensitive
@input_schema('Inputs', sample_input)
# 'GlobalParameters' is also case sensitive; this parameter is optional
@input_schema('GlobalParameters', sample_global_parameters)
@output_schema(sample_output)
def run(Inputs, GlobalParameters):
    try:
        data = Inputs['input1']
        # data will be converted to the target format
        assert isinstance(data, np.ndarray)
        result = model.predict(data)
        # return a dictionary keyed by 'Results' to match the output schema
        return {'Results': result.tolist()}
    except Exception as e:
        error = str(e)
        return error

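With this schema in place, the request body has to mirror the samples exactly: a case-sensitive Inputs wrapper holding input1/input2/input3, plus the optional GlobalParameters. As a rough sketch of a matching client call (the empty scoring_uri placeholder comes from the question's prediction script, and the default 'records' orient of PandasParameterType is assumed for input2):

import json
import requests

scoring_uri = ''  # service endpoint, as in the question's prediction script

payload = {
    "Inputs": {
        # matches numpy_sample_input: two rows of six features
        "input1": [[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0],
                   [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0]],
        # matches pandas_sample_input: 'records' orient, one dict per row
        "input2": [{"value": 2400.0, "delayed_percent": 78.26086956521739, "total_value_delayed": 11100.0,
                    "num_invoices_per30_dealing_days": 3.612565445026178, "delayed_streak": 3.0, "prompt_streak": 0.0},
                   {"value": 368.55, "delayed_percent": 96.88311688311687, "total_value_delayed": 709681.1600000012,
                    "num_invoices_per30_dealing_days": 73.88059701492537, "delayed_streak": 44.0, "prompt_streak": 0.0}],
        # matches standard_sample_input: a plain float
        "input3": 0.0
    },
    "GlobalParameters": 1.0
}

headers = {'Content-Type': 'application/json'}
resp = requests.post(scoring_uri, json.dumps(payload), headers=headers)
print(resp.text)

Sending the old {"data": [...]} body from the question's prediction script against this decorated run() is likely what produces errors like "list index out of range", because the decorator can't find the keys it expects.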

Upvotes: 1

carrigwat

Reputation: 11

I'm not sure if you've figured it out yet or not but I was having similar issues, and I couldn't get Power BI to see my ML model. In the end I just created a service specifically for Power BI (pandas df type) using the following schema:

import json
import pandas as pd
import numpy as np
import os
import joblib
from sklearn.ensemble import RandomForestClassifier

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

import pickle
import azureml.train.automl


# Called when the service is loaded
def init():
    # AZUREML_MODEL_DIR is an environment variable created during deployment. Join this path with the filename of the model file.
    # It holds the path to the directory that contains the deployed model (./azureml-models/$MODEL_NAME/$VERSION).
    # If there are multiple models, this value is the path to the directory containing all deployed models (./azureml-models).
    global model

    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')

    # Get the path to the deployed model file and load it
    # Deserialize the model file back into a sklearn model
    model = joblib.load(model_path)


input_sample = PandasParameterType(pd.DataFrame({
    'input1': [0.0, 20.0],
    'input2': [0.0, 20.0],
    'input3': [0.0, 20.0]
    }))


output_sample = PandasParameterType(pd.DataFrame([0.8, 0.2]))

# Called when a request is received
@input_schema('data', input_sample)
@output_schema(output_sample)
def run(data):
    try:
        result = model.predict(data)
        # You can return any data type, as long as it is JSON serializable.
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error
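For what it's worth, a minimal sketch of calling this pandas-based service over REST (the empty scoring_uri is a placeholder, and the default 'records' orient of PandasParameterType is assumed):

import json
import requests

scoring_uri = ''  # placeholder for the deployed service endpoint

# The top-level key must match the decorated parameter name ('data'),
# and each row is a dict keyed by the sample DataFrame's column names.
payload = {
    "data": [
        {"input1": 0.0, "input2": 0.0, "input3": 0.0},
        {"input1": 20.0, "input2": 20.0, "input3": 20.0}
    ]
}

headers = {'Content-Type': 'application/json'}
resp = requests.post(scoring_uri, json.dumps(payload), headers=headers)
print(resp.text)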

Upvotes: 1
