Steven Chan

Reputation: 473

Invoke SageMaker endpoint with custom inference script

I've deployed a SageMaker endpoint using the following code:

from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role, Session

sess = Session()
role = get_execution_role()

model = PyTorchModel(model_data=my_trained_model_location,
                     role=role,
                     sagemaker_session=sess,
                     framework_version='1.5.0',
                     entry_point='inference.py',
                     source_dir='.')

predictor = model.deploy(initial_instance_count=1, 
                         instance_type='ml.m4.xlarge',
                         endpoint_name='my_endpoint')

If I run:

import numpy as np

# Input data is a list of 2D numpy arrays with a variable first dimension
# and a fixed second dimension
pseudo_data = [np.random.randn(1, 300), np.random.randn(6, 300), np.random.randn(3, 300),
               np.random.randn(7, 300), np.random.randn(5, 300)]
result = predictor.predict(pseudo_data)

I get the result with no errors. However, if I want to invoke the endpoint and make a prediction by running:

from sagemaker.predictor import RealTimePredictor

predictor = RealTimePredictor(endpoint='my_endpoint')
result = predictor.predict(pseudo_data)

I'd get an error:

Traceback (most recent call last):
  File "default_local.py", line 77, in <module>
    score = predictor.predict(input_data)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/sagemaker/predictor.py", line 113, in predict
    response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 608, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 656, in _convert_to_request_dict
    api_params, operation_model)
  File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/validate.py", line 297, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body

From my understanding, the error occurs because I didn't pass inference.py as the entry-point script, which is required to handle the input since it isn't in a standard format supported by SageMaker. However, sagemaker.predictor.RealTimePredictor doesn't let me specify an entry-point script. How can I solve this?

Upvotes: 3

Views: 914

Answers (1)

Yoav Zimmerman

Reputation: 608

The error you're seeing is raised client-side by the SageMaker Python SDK, not by the remote endpoint you deployed.

Here is the documentation for the data argument (in your case, this is pseudo_data):

data (object) – Input data for which you want the model to provide inference. If a serializer was specified when creating the RealTimePredictor, the result of the serializer is sent as input data. Otherwise the data must be sequence of bytes, and the predict method then sends the bytes in the request body as is.

Source: https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.RealTimePredictor.predict

My guess is that pseudo_data is not the type the SageMaker Python SDK expects, which is a sequence of bytes.

Upvotes: 0
