I fine-tuned Llama 3.1 Instruct 8B on SageMaker JumpStart and am running inference against the deployed endpoint. When I don't pass any context (documents from a knowledge base) there is no error, but as soon as I include a context I get this error:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from primary with message "Connection reset by peer for the llama-3-1-8b-instruct-2024-10-31-15-14-54-211 endpoint. Please retry."
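For reference, here is a simplified sketch of how I invoke the endpoint (the exact payload keys depend on the JumpStart Llama 3.1 container version, but it follows the usual `inputs`/`parameters` JSON format; context and question are placeholders):

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Endpoint name taken from the error message above
endpoint_name = "llama-3-1-8b-instruct-2024-10-31-15-14-54-211"

context = "..."   # documents retrieved from the knowledge base
question = "..."

payload = {
    "inputs": f"Context:\n{context}\n\nQuestion: {question}",
    "parameters": {"max_new_tokens": 256, "temperature": 0.2},
}

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))
```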
In CloudWatch there is not much detail; the endpoint logs only show:
[WARN ] InferenceRequestHandler - Chunk reading interrupted
java.lang.IllegalStateException: Read chunk timeout.
Any help would be appreciated!