user3684119

Reputation: 63

Invoke endpoint after model deployment : [Err 104] Connection reset by peer

I am new to SageMaker. I have deployed my trained TensorFlow model using its JSON and weights files. Strangely, my notebook never said "Endpoint successfully built"; it only showed the following:

--------------------------------------------------------------------------------!

Instead, I found the endpoint name in my console.

import sagemaker
from sagemaker.tensorflow.model import TensorFlowPredictor

predictor = TensorFlowPredictor(endpoint_name, sagemaker_session)
data = test_out2
predictor.predict(data)

Then I tried to invoke the endpoint with a 2D array: (1) If the array has shape (5000, 170), I get this error:

ConnectionResetError: [Errno 104] Connection reset by peer

(2) If I reduce the array to shape (10, 170), the error is:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "". See https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-2019-04-28-XXXXXXXXX in account 15XXXXXXXX for more information.

Any suggestions, please? I found a similar case on GitHub: https://github.com/awslabs/amazon-sagemaker-examples/issues/589.

Is mine the same case?

Thank you very much in advance!

Upvotes: 0

Views: 897

Answers (2)

EnterpriseMike

Reputation: 195

I had this problem and this post helped me resolve it. There does seem to be a limit to the size of the dataset that the predictor will take. I'm not sure what it is, but in any case I now split my training/test data differently.

I assume there's a limit and that it is based on raw data volume. In rough terms that translates to the number of cells in my dataframe, since each cell is probably an integer or a float.

If I can get a 70%/30% split I use that, but if 30% test data would exceed the maximum number of cells, I split the data to give me the largest number of test rows that fits within that maximum.

Here's the split code:

import numpy as np
# model_data is assumed to be a pandas DataFrame

# Check that the test data isn't too big for the predictor
max_test_cells = 200000
model_rows, model_cols = model_data.shape
print('model_data.shape=', model_data.shape)

# Maximum number of test rows that fits within the cell budget
max_test_rows = int(max_test_cells / model_cols)
print('max_test_rows=', max_test_rows)

# Aim for a 30% test set, capped at the maximum
test_rows = min(int(0.3 * len(model_data)), max_test_rows)
print('actual_test_rows=', test_rows)
training_rows = model_rows - test_rows
print('training_rows=', training_rows)

# Shuffle, then split to get the largest test set possible
train_data, test_data = np.split(model_data.sample(frac=1, random_state=1729), [training_rows])
print(train_data.shape, test_data.shape)

Upvotes: 0

SphericalCow

Reputation: 176

The first error with data of shape (5000, 170) might be a capacity issue. SageMaker endpoint requests have a payload size limit of 5 MB, so if your data is larger than that, you need to chop it into pieces and call predict multiple times, as sketched below.
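As a rough illustration, here is a minimal sketch of that batching approach. The predict_in_batches helper and the 500-row batch size are assumptions for illustration, not part of the SageMaker SDK; pick a batch size so each request stays under the payload limit.

# Hypothetical helper: split the input into row batches and call
# predict once per batch, collecting the partial results.
def predict_in_batches(predictor, data, batch_size=500):
    results = []
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]    # rows [start, start + batch_size)
        results.append(predictor.predict(batch))  # one request per batch
    return results

# e.g. for the (5000, 170) array from the question:
# predictions = predict_in_batches(predictor, test_out2, batch_size=500)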

For the second error with data of shape (10, 170), the error message points you to the logs. Did you find anything interesting in the CloudWatch log? Is there anything you can share in this question?

Upvotes: 1
