Reputation: 6281
I am running into a really strange issue. I am sending a simple "hi" message to gpt-4 with the curl request below. The model's maximum context length is 8192 tokens, and "hi" should be 1 token (you can check here - https://platform.openai.com/tokenizer). That means I should be able to set max_tokens to 8192 - 1 = 8191.
However, OpenAI's API calculates that my messages take 8 tokens. Why?
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-bla-bla-bla" \
  -d '{
    "model": "gpt-4",
    "max_tokens": 8191,
    "messages": [
      {
        "role": "user",
        "content": "hi"
      }
    ]
  }'
{
  "error": {
    "message": "This model's maximum context length is 8192 tokens. However, you requested 8199 tokens (8 in the messages, 8191 in the completion). Please reduce the length of the messages or completion.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}
Upvotes: 1
Views: 398
Reputation: 6281
Just found the answer here, crazy: https://community.openai.com/t/what-is-the-reason-for-adding-total-7-tokens/337002 - the chat format itself adds a fixed overhead on top of the content tokens (7 extra tokens in this case), so a 1-token message is billed as 8.
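A rough sketch of the accounting, based on the counting recipe published in the OpenAI cookbook for gpt-4 / gpt-3.5-turbo (the 3-tokens-per-message framing and 3-token reply-priming constants are assumptions taken from there, and the stand-in encoder below is purely illustrative - a real count would use tiktoken):

```python
def count_chat_tokens(messages, encode):
    """Estimate the token count the API attributes to a chat request.

    encode: a function mapping a string to its list of token ids.
    """
    num_tokens = 0
    for message in messages:
        num_tokens += 3  # per-message framing, e.g. <|start|>{role}<|message|>
        for value in message.values():
            num_tokens += len(encode(value))  # role and content are both tokenized
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

# Crude stand-in encoder: "user" and "hi" each happen to be a single token.
fake_encode = lambda s: [0] * (1 if s in ("user", "hi") else len(s.split()))

msgs = [{"role": "user", "content": "hi"}]
print(count_chat_tokens(msgs, fake_encode))  # → 8, matching the API error
```

So the 8 tokens in the error break down as: 3 (message framing) + 1 ("user") + 1 ("hi") + 3 (reply priming), which is why max_tokens can be at most 8192 - 8 = 8184 for this request.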
Upvotes: 0