Yavor from Team-GPT

Reputation: 6281

OpenAI API max_tokens counting

I am having a really strange issue. I am running the curl request below, sending a simple "hi" message to gpt-4. The model's maximum context length is 8192 tokens, and "hi" should be 1 token (you can check here - https://platform.openai.com/tokenizer). That means I should be able to set max_tokens to 8192 - 1 = 8191.

However, OpenAI's API calculates that my messages take 8 tokens, not 1. Why?

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-bla-bla-bla" \
  -d '{
    "model": "gpt-4",
    "max_tokens": 8191,
    "messages": [
      {
        "role": "user",
        "content": "hi"
      }
    ]
  }'

The API responds with:
{
  "error": {
    "message": "This model's maximum context length is 8192 tokens. However, you requested 8199 tokens (8 in the messages, 8191 in the completion). Please reduce the length of the messages or completion.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}
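My working theory (an assumption based on the OpenAI cookbook's token-counting recipe, not anything the error message itself confirms): chat messages carry hidden formatting tokens, roughly 3 per message for the chat framing, plus the tokens of each field's value, plus 3 to prime the assistant's reply. A minimal sketch of that arithmetic, using a stub encoder in place of tiktoken (in cl100k_base, "user" and "hi" are 1 token each):

```python
# Assumed per the OpenAI cookbook's recipe for gpt-4 (not from the API docs):
TOKENS_PER_MESSAGE = 3  # chat-format framing around every message
REPLY_PRIMING = 3       # tokens that prime the assistant's reply

def count_prompt_tokens(messages, encode):
    """Estimate prompt tokens for a chat completion request."""
    total = 0
    for msg in messages:
        total += TOKENS_PER_MESSAGE
        for value in msg.values():      # role and content are both tokenized
            total += len(encode(value))
    return total + REPLY_PRIMING

# Stub encoder standing in for tiktoken: "user" and "hi" are 1 token each.
token_counts = {"user": 1, "hi": 1}
encode = lambda s: [0] * token_counts[s]

print(count_prompt_tokens([{"role": "user", "content": "hi"}], encode))  # 8
```

That would reproduce the 8 tokens the error reports: 3 (framing) + 1 ("user") + 1 ("hi") + 3 (reply priming). Is this overhead documented anywhere official?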

Upvotes: 1

Views: 398

Answers (1)
