Yavor from Team-GPT

Reputation: 6281

OpenAI API max_tokens counting

I am having a really strange issue. I am running the curl request below, sending a simple "hi" message to gpt-4. The model's maximum context length is 8192 tokens, and "hi" should be 1 token (you can check here - https://platform.openai.com/tokenizer). That means I should be able to set max_tokens to 8192 - 1 = 8191.

However, OpenAI's API calculates that my messages take 8 tokens, not 1. Why?

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-bla-bla-bla" \
  -d '{
    "model": "gpt-4",
    "max_tokens": 8191,
    "messages": [
      {
        "role": "user",
        "content": "hi"
      }
    ]
  }'

The API responds with:
{
  "error": {
    "message": "This model's maximum context length is 8192 tokens. However, you requested 8199 tokens (8 in the messages, 8191 in the completion). Please reduce the length of the messages or completion.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}
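My working theory (an assumption based on the OpenAI cookbook's token-counting recipe, not anything the error message itself confirms): chat messages carry hidden formatting tokens, roughly 3 per message for the chat framing, plus the tokens of each field's value, plus 3 to prime the assistant's reply. A minimal sketch of that arithmetic, using a stub encoder in place of tiktoken (in cl100k_base, "user" and "hi" are 1 token each):

```python
# Assumed per the OpenAI cookbook's recipe for gpt-4 (not from the API docs):
TOKENS_PER_MESSAGE = 3  # chat-format framing around every message
REPLY_PRIMING = 3       # tokens that prime the assistant's reply

def count_prompt_tokens(messages, encode):
    """Estimate prompt tokens for a chat completion request."""
    total = 0
    for msg in messages:
        total += TOKENS_PER_MESSAGE
        for value in msg.values():      # role and content are both tokenized
            total += len(encode(value))
    return total + REPLY_PRIMING

# Stub encoder standing in for tiktoken: "user" and "hi" are 1 token each.
token_counts = {"user": 1, "hi": 1}
encode = lambda s: [0] * token_counts[s]

print(count_prompt_tokens([{"role": "user", "content": "hi"}], encode))  # 8
```

That would reproduce the 8 tokens the error reports: 3 (framing) + 1 ("user") + 1 ("hi") + 3 (reply priming). Is this overhead documented anywhere official?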

Upvotes: 1

Views: 398

Answers (1)
