JHall651

Reputation: 437

How to use multiple inputs for custom Tensorflow model hosted by AWS Sagemaker

I have a trained Tensorflow model that uses two inputs to make predictions. I have successfully set up and deployed the model on AWS Sagemaker.

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data='s3://' + sagemaker_session.default_bucket() 
                              + '/R2-model/R2-model.tar.gz',
                             role = role,
                             framework_version = '1.12',
                             py_version='py2',
                             entry_point='train.py')

predictor = sagemaker_model.deploy(initial_instance_count=1,
                              instance_type='ml.m4.xlarge')

predictor.predict([data_scaled_1.to_csv(),
                   data_scaled_2.to_csv()]
                 )

I always receive an error. I could use an AWS Lambda function, but I don't see any documentation on specifying multiple inputs to deployed models. Does anyone know how to do this?

Upvotes: 1

Views: 1987

Answers (3)

Slim Frikha

Reputation: 71

You first need to build a correct serving signature when exporting the model, and you need to deploy it with TensorFlow Serving.

At inference time, you also need to send the request in the proper format: the SageMaker serving container essentially takes the request body and passes it on to TensorFlow Serving, so the input must match the TF Serving signature inputs.

Here is a simple example of deploying a Keras multi-input, multi-output model with TensorFlow Serving on SageMaker, and how to run inference against it afterwards:

import tarfile

from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants
from keras import backend as K
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import Model


def serialize_to_tf_and_dump(model, export_path):
    """
    serialize a Keras model to TF model
    :param model: compiled Keras model
    :param export_path: str, The export path contains the name and the version of the model
    :return:
    """
    # Build the Protocol Buffer SavedModel at 'export_path'
    save_model_builder = builder.SavedModelBuilder(export_path)
    # Create prediction signature to be used by TensorFlow Serving Predict API
    signature = predict_signature_def(
        inputs={
            "input_type_1": model.input[0],
            "input_type_2": model.input[1],
        },
        outputs={
            "decision_output_1": model.output[0],
            "decision_output_2": model.output[1],
            "decision_output_3": model.output[2]
        }
    )
    with K.get_session() as sess:
        # Save the meta graph and variables
        save_model_builder.add_meta_graph_and_variables(
            sess=sess, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
        save_model_builder.save()

# instantiate your compiled Keras model
model = .... 

# convert to tf model
serialize_to_tf_and_dump(model, 'model_folder/1')

# tar tf model
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('model_folder', recursive=True)

# upload it to s3
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz')

# convert to sagemaker model
role = get_execution_role()
sagemaker_model = Model(model_data = inputs,
    name='DummyModel',
    role = role,
    framework_version = '1.12')

predictor = sagemaker_model.deploy(initial_instance_count=1,
    instance_type='ml.t2.medium', endpoint_name='MultiInputMultiOutputModel')

At inference time, here is how to request predictions:

import json
import boto3

x_inputs = ...  # list with 2 np arrays of size (batch_size, ...)
data = {
    'inputs': {
        "input_type_1": x_inputs[0].tolist(),
        "input_type_2": x_inputs[1].tolist()
    }
}

endpoint_name = 'MultiInputMultiOutputModel'
client = boto3.client('runtime.sagemaker')
response = client.invoke_endpoint(EndpointName=endpoint_name, Body=json.dumps(data), ContentType='application/json')
predictions = json.loads(response['Body'].read())
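
Assuming the named outputs defined in the signature above, the JSON response can then be unpacked per output name. A rough sketch (the exact top-level key depends on the TF Serving REST response format):

# 'outputs' is the usual top-level key for the columnar 'inputs' request format;
# fall back to the raw response if the layout differs
outputs = predictions.get('outputs', predictions)
decision_1 = outputs['decision_output_1']   # nested lists of shape (batch_size, ...)
decision_2 = outputs['decision_output_2']
decision_3 = outputs['decision_output_3']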

Upvotes: 3

Rui

Reputation: 61

Only the TF serving endpoint supports multiple inputs in one inference request. You can follow the documentation here to deploy a TFS endpoint - https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst
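
For reference, a minimal TFS deployment along the lines of that guide might look like the sketch below (the S3 path and endpoint name are placeholders, and model.tar.gz must contain the exported SavedModel under a numbered version directory):

import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import Model

role = get_execution_role()
# model.tar.gz should contain the exported SavedModel, e.g. model_folder/1/...
tfs_model = Model(model_data='s3://my-bucket/model.tar.gz',  # placeholder path
                  role=role,
                  framework_version='1.12')
predictor = tfs_model.deploy(initial_instance_count=1,
                             instance_type='ml.m4.xlarge',
                             endpoint_name='my-tfs-endpoint')  # placeholder name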

Upvotes: 0

Olivier Cruchant

Reputation: 4037

You likely need to customize the inference functions loaded in the endpoint. In the SageMaker TensorFlow SDK documentation you can find that there are two options for SageMaker TensorFlow deployment: TensorFlow Serving-based endpoints and the older Python-based endpoints.

You can diagnose the error in CloudWatch (accessible through the SageMaker endpoint UI), choose the most appropriate serving architecture of the two mentioned above, and customize the inference functions if need be.
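
If you end up customizing the inference functions, the TensorFlow Serving container lets you provide an inference.py with input_handler / output_handler hooks. A minimal sketch (the input names here are placeholders and must match your SavedModel signature):

# inference.py for the SageMaker TensorFlow Serving container
import json

def input_handler(data, context):
    # Turn the incoming request into a TF Serving predict request body
    if context.request_content_type == 'application/json':
        payload = json.loads(data.read().decode('utf-8'))
        # 'input_type_1' / 'input_type_2' are placeholders; they must match
        # the input tensor names in the SavedModel signature
        return json.dumps({'inputs': {
            'input_type_1': payload['input_type_1'],
            'input_type_2': payload['input_type_2'],
        }})
    raise ValueError('Unsupported content type: {}'.format(context.request_content_type))

def output_handler(data, context):
    # Pass the TF Serving response straight back to the caller
    return data.content, context.accept_header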

Upvotes: 0
