dairy

Reputation: 181

Create multi-message conversations with the GPT API

I am experimenting with the GPT API by OpenAI and am learning how to use the GPT-3.5-Turbo model. I found a quickstart example on the web:

def generate_chat_completion(messages, model="gpt-3.5-turbo", temperature=1, max_tokens=None):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }

    data = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

    if max_tokens is not None:
        data["max_tokens"] = max_tokens

    response = requests.post(API_ENDPOINT, headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"Error {response.status_code}: {response.text}")

while True:
    inputText = input("Enter your message: ")

    messages = [
        {"role": "user", "content": inputText},
    ]

    response_text = generate_chat_completion(messages)
    print(response_text)

The necessary imports, the API key, and the endpoint are defined above this code block. I added the inputText variable to take text input and an infinite while loop to keep the input/response cycle going until the program is terminated (probably bad practice).

However, I've noticed that responses from the API can't reference earlier parts of the conversation the way the ChatGPT web application does (understandably, since I haven't passed in any form of conversation history). I looked at the API documentation on chat completions, and its conversation request example is as follows:

[
  {"role": "system", "content": "You are a helpful assistant that translates English to French."},
  {"role": "user", "content": "Translate the following English text to French: \"{text}\""}
]

However, this means I would have to send all the input messages in one request and get a response back for each of them. I can't seem to find a way (at least as described in the API docs) to send a message, get one back, and then send another message as part of a full conversation that references previous messages, like a chatbot (or, as described above, the ChatGPT app). Is there some way to implement this?

Also: the code above does not use the OpenAI Python module; it uses the requests and json modules.

Upvotes: 7

Views: 15096

Answers (4)

Tiago Gouvêa

Reputation: 16790

Another approach, which keeps the history shorter at the cost of spending a little more money, is to create message summaries occasionally.

For example: when you reach 18 messages, call a completion to summarize the first 6 messages of the conversation into a single sentence or paragraph.

After that, instead of sending those first 6 messages, you send just the summary of the earlier conversation in a role: system message.

This is a widely used approach with the chat completion API.
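A minimal sketch of this approach. The threshold (18), batch size (6), and the `summarize` callable are illustrative assumptions; in practice `summarize` would be another chat-completion request that asks the model to condense the old messages into one paragraph.

```python
def compact_history(messages, summarize, threshold=18, batch=6):
    """Once the history reaches `threshold` messages, replace the oldest
    `batch` messages with a single role: system summary message."""
    if len(messages) < threshold:
        return messages
    oldest, rest = messages[:batch], messages[batch:]
    summary = summarize(oldest)  # e.g. one extra completion request
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + rest
```

You would run this on the message list before each request, so the payload stays bounded while older context survives in condensed form.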

Another solution is to use the Assistants API, so you don't need to keep the history yourself; OpenAI takes care of it. However, it still charges for the full message history, and you won't be able to summarize past messages to shorten them.

Google announced last month that Gemini will have a context cache, so you won't need to send the full history on every interaction; the cache will be billed differently and should be cheaper. Let's wait and see!

Upvotes: 0

Just send the entire message history whenever you request a response during a session:

messages = [
    {"role": "system", "content": system_message},
] + session['chat_history'] + [{"role": "user", "content": user_message}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages
)

bot_response = response.choices[0].message.content
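To keep the next request aware of this exchange, the session history also needs updating after every turn. A small sketch, assuming `session['chat_history']` is a plain list of message dicts as in the snippet above:

```python
def record_turn(chat_history, user_message, bot_response):
    """Append the latest user/assistant exchange so the next request's
    `messages` list includes it."""
    chat_history.append({"role": "user", "content": user_message})
    chat_history.append({"role": "assistant", "content": bot_response})
    return chat_history
```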

Upvotes: 2

stevec

Reputation: 52768

Here's a very nice example; basically just send the entire message history (along with the 'role' of each message):

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

print(response.choices[0].message.content)
# The 2020 World Series was played at Globe Life Field in Arlington, Texas.

Example and more info in the GPT API docs.

Upvotes: 6

macasas

Reputation: 582

I saw your question and was hoping to see some answers, because I have a similar issue: how many previous messages are needed, since eventually all those messages add up and go over the token limits. Alas.

In your case, the response_text that comes back is extracted from a list of choices; you can take that response and add it to the messages array, which grows to become your step-by-step conversation. The API docs example is the starting point for that: it's an array of messages, so keep adding to it. How big you allow it to grow is your next question, and no doubt it will add to the token cost as well.
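The keep-adding idea above can be sketched as one turn of a conversation loop. The `chat_turn` name is illustrative, and `complete` stands in for the question's generate_chat_completion function (any callable that takes the full messages list and returns the reply text):

```python
def chat_turn(messages, user_text, complete):
    """One conversation turn: append the user's message, request a
    completion over the full history, append the assistant's reply,
    and return it."""
    messages.append({"role": "user", "content": user_text})
    reply = complete(messages)  # sends the whole history each time
    messages.append({"role": "assistant", "content": reply})
    return reply
```

Calling this inside the question's while loop (instead of rebuilding `messages` from scratch each iteration) gives the model the full conversation on every request.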

Upvotes: 1
