Reputation: 189
I am creating a chat app using the gpt-3.5-turbo model. Each time I resend the API request with a new message appended to the conversation, does the API bill for all the tokens (including the assistant messages and the older messages), or only for the last message from the user?
For example:
import openai

# Conversation history; the whole list is sent with every request.
messages = [
    {"role": "system", "content": "You are a kind helpful assistant."},
]
while True:
    message = input("User : ")
    if message:
        messages.append(
            {"role": "user", "content": message},
        )
        chat = openai.ChatCompletion.create(
            model="gpt-3.5-turbo", messages=messages
        )
        reply = chat.choices[0].message.content
        print(f"ChatGPT: {reply}")
        # Keep the reply so the next request includes it as context.
        messages.append({"role": "assistant", "content": reply})
Upvotes: 4
Views: 2920
Reputation: 1497
As mentioned in the OpenAI documentation:
The total number of tokens in an API call affects how much your API call costs, as you pay per token.
Both input and output tokens count toward these quantities. For example, if your API call used 10 tokens in the message input and you received 20 tokens in the message output, you would be billed for 30 tokens.
To see how many tokens are used by an API call, check the usage field in the API response:
response['usage']['total_tokens']
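As a rough illustration of reading that field, here is a minimal sketch. The usage object also reports prompt_tokens and completion_tokens alongside total_tokens. The sample_response dictionary and the summarize_usage helper are made up for this example, and the numbers simply mirror the 10 + 20 = 30 case quoted above; a real response carries more fields.

```python
# Hypothetical payload shaped like the usage part of a chat completion response.
# The numbers are invented to match the 10-in / 20-out example above.
sample_response = {
    "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30}
}


def summarize_usage(response):
    """Return (input, output, total) token counts from a response's usage field."""
    usage = response["usage"]
    return usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"]


prompt, completion, total = summarize_usage(sample_response)
print(f"input={prompt} output={completion} billed={total}")
```

With a real call you would pass the object returned by the API instead of sample_response.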
Each time you append previous chats to messages, the value of total_tokens increases, because the whole list is sent again as input. So all tokens from the previous messages are counted in the bill on every request.
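To make the growth concrete, here is a small simulation, offered as a sketch rather than an exact billing calculator. The function name billed_tokens_per_request and all token counts are hypothetical, and it assumes the full history is resent on every call; it also ignores the few extra formatting tokens the API adds per message.

```python
def billed_tokens_per_request(system_tokens, turns):
    """Estimate total tokens billed for each request in a conversation.

    turns is a list of (user_tokens, assistant_tokens) pairs, one per exchange.
    Assumes the whole history is resent with every request.
    """
    history = system_tokens
    billed = []
    for user_tokens, assistant_tokens in turns:
        prompt = history + user_tokens            # old messages + new user message
        billed.append(prompt + assistant_tokens)  # input and output are both billed
        history = prompt + assistant_tokens       # the reply joins the history
    return billed


# Three exchanges, each with a 10-token question and a 20-token answer (made-up sizes).
turns = [(10, 20), (10, 20), (10, 20)]
per_request = billed_tokens_per_request(system_tokens=8, turns=turns)
print(per_request)       # [38, 68, 98]
print(sum(per_request))  # 204
```

Even though every exchange is the same size, each request costs more than the one before it, which is why long conversations are often trimmed or summarized before being resent.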
Upvotes: 3