Reputation: 6875
I am trying to invoke the Anthropic Claude Sonnet 3.5 v2 model in AWS Bedrock. Here is my code (in Python using Boto3):
import boto3
import json
bedrock_runtime = boto3.client(service_name="bedrock-runtime")
body = json.dumps({
"max_tokens": 256,
"messages": [{"role": "user", "content": "Hello, world"}],
"anthropic_version": "bedrock-2023-05-31"
})
response = bedrock_runtime.invoke_model(body=body, modelId="anthropic.claude-3-5-sonnet-20241022-v2:0")
print(response.get("body").read())
I get the error:
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Invocation of model ID anthropic.claude-3-5-sonnet-20241022-v2:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.
What does this mean, and how do I invoke the model through an inference profile?
Upvotes: 0
Views: 501
Reputation: 6875
As the error says, you must provide the ID of an inference profile and not the model for this particular model. The easiest way to do this is to provide the ID of a system-defined inference profile for this model. You can find it by invoking this awscli
command with the correct credentials defined in the environment (or set via standard flags):
aws bedrock list-inference-profiles
You will see this one in the JSON list:
{
"inferenceProfileName": "US Anthropic Claude 3.5 Sonnet v2",
"description": "Routes requests to Anthropic Claude 3.5 Sonnet v2 in us-west-2, us-east-1 and us-east-2.",
"inferenceProfileArn": "arn:aws:bedrock:us-east-1:381492273274:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
"models": [
{
"modelArn": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
},
{
"modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
},
{
"modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
}
],
"inferenceProfileId": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
"status": "ACTIVE",
"type": "SYSTEM_DEFINED"
}
Modify the invoke_model
line in your code to specify the ID or ARN of the instance profile instead:
response = bedrock_runtime.invoke_model(body=body, modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0")
Upvotes: 0