Steven Chan

Reputation: 473

Invoke SageMaker endpoint with custom inference script

I've deployed a SageMaker endpoint using the following code:

from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role, Session

sess = Session()
role = get_execution_role()

model = PyTorchModel(model_data=my_trained_model_location,
                     role=role,
                     sagemaker_session=sess,
                     framework_version='1.5.0',
                     entry_point='inference.py',
                     source_dir='.')

predictor = model.deploy(initial_instance_count=1, 
                         instance_type='ml.m4.xlarge',
                         endpoint_name='my_endpoint')

If I run:

import numpy as np

# Input data is a list of 2D numpy arrays with a variable first dimension
# and a fixed second dimension
pseudo_data = [np.random.randn(1, 300), np.random.randn(6, 300), np.random.randn(3, 300),
               np.random.randn(7, 300), np.random.randn(5, 300)]
result = predictor.predict(pseudo_data)

I get the result with no errors. However, if I want to invoke the endpoint and make a prediction by running:

from sagemaker.predictor import RealTimePredictor

predictor = RealTimePredictor(endpoint='my_endpoint')
result = predictor.predict(pseudo_data)

I'd get an error:

Traceback (most recent call last):
  File "default_local.py", line 77, in <module>
    score = predictor.predict(input_data)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/sagemaker/predictor.py", line 113, in predict
    response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 608, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 656, in _convert_to_request_dict
    api_params, operation_model)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/validate.py", line 297, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body

From my understanding, the error occurs because I didn't pass inference.py as the entry-point script, which is required to handle the input since it isn't in a standard format supported by SageMaker. However, sagemaker.predictor.RealTimePredictor doesn't let me specify an entry-point script. How can I solve this?

Upvotes: 3

Views: 914

Answers (1)

Yoav Zimmerman

Reputation: 608

The error you're seeing is raised client-side by the SageMaker Python SDK, not by the remote endpoint you deployed.

Here is the documentation for the data argument (in your case, this is pseudo_data):

data (object) – Input data for which you want the model to provide inference. If a serializer was specified when creating the RealTimePredictor, the result of the serializer is sent as input data. Otherwise the data must be sequence of bytes, and the predict method then sends the bytes in the request body as is.

Source: https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.RealTimePredictor.predict

My guess is that pseudo_data is not the type the SageMaker Python SDK expects, which is a sequence of bytes.

Upvotes: 0
