Dipankar Ghosh

Reputation: 135

Amazon SageMaker: Invoke endpoint with file as multipart/form-data

After setting up an endpoint for my model on Amazon SageMaker, I am trying to invoke it with a POST request that contains a file under the key image, with content type multipart/form-data.

My AWS CLI command is like this:

aws sagemaker-runtime invoke-endpoint --endpoint-name <endpoint-name> --body image=@/local/file/path/dummy.jpg --content-type multipart/form-data output.json --region us-east-1

which should be an equivalent of:

curl -X POST -F "image=@/local/file/path/dummy.jpg" http://<endpoint>

After running the aws command, the file is not transferred via the request, and my model is receiving the request without any file in it.

Can someone please tell me what should be the correct format of the aws command in order to achieve this?

Upvotes: 9

Views: 4003

Answers (2)

Meir Pechthalt

Reputation: 313

For anyone landing here who is OK with doing this in Python, here is what I found.

Attribution: https://betatim.github.io/posts/python-create-multipart-formdata/

Send form data with a file to a SageMaker endpoint with Python

Use urllib3.encode_multipart_formdata

Example:

import boto3
import sagemaker
from sagemaker.predictor import Predictor
from urllib3 import encode_multipart_formdata

ENDPOINT_NAME = "example-endpoint"
IMAGE_PATH = "examples/image.jpg"
KEY = "image"
FILENAME = "image.jpg"

session = boto3.Session()
sagemaker_session = sagemaker.Session(boto_session=session)
predictor = Predictor(ENDPOINT_NAME, sagemaker_session)
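# Read the image bytes and close the file promptly.
with open(IMAGE_PATH, "rb") as image_file:
    image_bytes = image_file.read()

# Field name -> (filename, file contents, MIME type).
data = {KEY: (FILENAME, image_bytes, "image/jpeg")}

# Returns the encoded body and a Content-Type value that includes the boundary.
body, content_type = encode_multipart_formdata(data)
inference_response = predictor.predict(body, initial_args={"ContentType": content_type})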

print(inference_response)
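
To make it clearer what encode_multipart_formdata produces, here is a rough standard-library sketch of the same encoding for a single file field. The helper names build_multipart and invoke are made up for illustration, the image bytes are a stand-in, and the boto3 call assumes configured AWS credentials and an existing endpoint. The important detail is that the Content-Type value carries the boundary that separates the parts.

```python
import uuid


def build_multipart(field, filename, payload, mime_type):
    """Return (body, content_type) for a single-file multipart/form-data request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def invoke(endpoint_name, body, content_type):
    """Send the encoded body with boto3 (hypothetical helper; needs AWS credentials)."""
    import boto3  # imported lazily so the encoding part runs without boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name, Body=body, ContentType=content_type
    )
    return response["Body"].read()


# Stand-in bytes so the sketch runs without a real image file.
body, content_type = build_multipart(
    "image", "image.jpg", b"\xff\xd8\xff\xe0", "image/jpeg"
)
print(content_type)
```

Calling invoke("example-endpoint", body, content_type) would then send the request. Whether the model container parses it correctly depends on the serving code behind the endpoint.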

Upvotes: 1

tgoodhart

Reputation: 3266

The first problem is that you're using 'http' for your curl request. Virtually all AWS services require 'https' as their protocol, SageMaker included (see https://docs.aws.amazon.com/general/latest/gr/rande.html). I'm going to assume this was a typo, though.

You can check the verbose output of the AWS CLI by passing the '--debug' argument to your call. I re-ran a similar experiment with my favorite duck.jpg image:

aws --debug sagemaker-runtime invoke-endpoint --endpoint-name MyEndpoint --body image=@/duck.jpg --content-type multipart/form-data  >(cat)

Looking at the output, I see:

2018-08-10 08:42:20,870 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=InvokeEndpoint) (verify_ssl=True) with params: {'body': 'image=@/duck.jpg', 'url': u'https://sagemaker.us-west-2.amazonaws.com/endpoints/MyEndpoint/invocations', 'headers': {u'Content-Type': 'multipart/form-data', 'User-Agent': 'aws-cli/1.15.14 Python/2.7.10 Darwin/16.7.0 botocore/1.10.14'}, 'context': {'auth_type': None, 'client_region': 'us-west-2', 'has_streaming_input': True, 'client_config': <botocore.config.Config object at 0x109a58ed0>}, 'query_string': {}, 'url_path': u'/endpoints/MyEndpoint/invocations', 'method': u'POST'}

It looks like the AWS CLI is sending the string literal 'image=@/duck.jpg' as the body, not the file contents.

Trying again with curl and the "--verbose" flag:

curl --verbose -X POST -F "image=@/duck.jpg" https://sagemaker.us-west-2.amazonaws.com/endpoints/MyEndpoint/invocations

I see the following:

Content-Length: 63097

Much better. The '@' operator is a curl-specific feature. The AWS CLI does have a way to pass files, though:

--body fileb:///duck.jpg

There is also a 'file://' prefix for non-binary files such as JSON. Unfortunately, you cannot put the 'image=' field name in front of either prefix. That is, you cannot say:

 --body image=fileb:///duck.jpg

You can prepend the string 'image=' to your file with a command such as the following. (It uses printf and cat rather than echo with command substitution, because the shell drops NUL bytes from $(cat ...) output and 'echo -e' interprets backslash sequences, and either of those corrupts binary data.)

 { printf 'image='; cat /duck.jpg; } > duck_with_prefix

Your final command would then be:

 aws sagemaker-runtime invoke-endpoint --endpoint-name MyEndpoint --body fileb:///duck_with_prefix --content-type multipart/form-data  >(cat)
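
If you would rather build the prefixed file in Python, where binary data is handled as bytes throughout, something like the following sketch should work. The helper name write_prefixed is made up for illustration, and the demo uses a temporary directory with stand-in bytes rather than a real duck.jpg. Keep in mind that 'image=' followed by raw bytes is not a true multipart/form-data body (there is no boundary), so the code behind the endpoint has to expect exactly this layout.

```python
import os
import tempfile


def write_prefixed(src_path, dst_path, prefix=b"image="):
    """Write prefix followed by the raw bytes of src_path to dst_path."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(prefix)
        # Copy in chunks so large images are not held in memory at once.
        for chunk in iter(lambda: src.read(64 * 1024), b""):
            dst.write(chunk)


# Demo with stand-in bytes (including a NUL byte) instead of a real image.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "duck.jpg")
dst = os.path.join(workdir, "duck_with_prefix")
with open(src, "wb") as f:
    f.write(b"\xff\xd8\x00\xff")

write_prefixed(src, dst)
with open(dst, "rb") as f:
    result = f.read()
print(result)
```

The resulting file can be passed to the CLI with --body fileb:// followed by its path.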

Another note: Using raw curl with AWS services is extremely difficult due to the AWS Auth signing requirements - https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html

It can be done, but you'll likely be more productive by using the AWS CLI or a pre-existing tool such as Postman - https://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-use-postman-to-call-api.html
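
For a sense of what the signing step involves, here is a minimal sketch that uses botocore's own signer to produce the headers a raw HTTP client would have to send. The helper names invocation_url and signed_headers are hypothetical. The sketch assumes the runtime host pattern runtime.sagemaker.<region>.amazonaws.com, the signing service name "sagemaker", and credentials that boto3 can find in the environment.

```python
def invocation_url(region, endpoint_name):
    """Build the invocation URL (assumed host pattern for the SageMaker runtime)."""
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


def signed_headers(region, endpoint_name, body, content_type):
    """Return SigV4-signed headers for a POST of body to the endpoint."""
    # Imported lazily: these require boto3/botocore and configured credentials.
    import boto3
    from botocore.auth import SigV4Auth
    from botocore.awsrequest import AWSRequest

    credentials = boto3.Session().get_credentials()
    request = AWSRequest(
        method="POST",
        url=invocation_url(region, endpoint_name),
        data=body,
        headers={"Content-Type": content_type},
    )
    # Adds Authorization and X-Amz-Date (plus a session token header if present).
    SigV4Auth(credentials, "sagemaker", region).add_auth(request)
    return dict(request.headers)


url = invocation_url("us-west-2", "MyEndpoint")
print(url)
```

The returned headers would then have to be sent unchanged, together with the exact same body, by curl or any other HTTP client. That is why the CLI or an SDK is usually the easier route.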

Upvotes: 5
