Asfand Qazi
Asfand Qazi

Reputation: 6875

AWS Bedrock: cannot invoke Anthropic Claude Sonnet 3.5 v2 model, raises error "Invocation of model ID with on-demand throughput isn’t supported."

I am trying to invoke the Anthropic Claude Sonnet 3.5 v2 model in AWS Bedrock. Here is my code (in Python using Boto3):

import boto3
import json

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

body = json.dumps({
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello, world"}],
    "anthropic_version": "bedrock-2023-05-31"
})

response = bedrock_runtime.invoke_model(body=body, modelId="anthropic.claude-3-5-sonnet-20241022-v2:0")

print(response.get("body").read())

I get the error:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Invocation of model ID anthropic.claude-3-5-sonnet-20241022-v2:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.

What does this mean, and how do I invoke the model through an inference profile?

Upvotes: 0

Views: 501

Answers (1)

Asfand Qazi
Asfand Qazi

Reputation: 6875

As the error says, you must provide the ID of an inference profile and not the model for this particular model. The easiest way to do this is to provide the ID of a system-defined inference profile for this model. You can find it by invoking this awscli command with the correct credentials defined in the environment (or set via standard flags):

aws bedrock list-inference-profiles

You will see this one in the JSON list:

{
  "inferenceProfileName": "US Anthropic Claude 3.5 Sonnet v2",
  "description": "Routes requests to Anthropic Claude 3.5 Sonnet v2 in us-west-2, us-east-1 and us-east-2.",
  "inferenceProfileArn": "arn:aws:bedrock:us-east-1:381492273274:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
  "models": [
    {
      "modelArn": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    },
    {
      "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    },
    {
      "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    }
  ],
  "inferenceProfileId": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
  "status": "ACTIVE",
  "type": "SYSTEM_DEFINED"
}

Modify the invoke_model line in your code to specify the ID or ARN of the instance profile instead:

response = bedrock_runtime.invoke_model(body=body, modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0")

Upvotes: 0

Related Questions