Duy Bui

Reputation: 1396

GPT-3 davinci gives different results with the same prompt

I'm not sure whether you have access to GPT-3, specifically DaVinci (the text-completion engine). You can find the API and info here

I've been trying this tool for the past hour, and every time I hit the API with the same prompt (literally the same input), I receive a different response.

  1. Have you encountered the same behavior?
  2. If this is expected, do you happen to know the reason behind it?

Here are some examples.

Request body (I used the same example they provide):

{
  "prompt": "Once upon a time",
  "max_tokens": 3,
  "temperature": 1,
  "top_p": 1,
  "n": 1,
  "stream": false,
  "logprobs": null,
  "stop": "\n"
}
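
For reproduction, here is a minimal Python sketch of sending the exact same request a few times; the endpoint URL and the API-key environment variable are assumptions, only the request fields come from the example above.

import os
import requests

# Assumed legacy engines completions endpoint; adjust if you use a different one.
URL = "https://api.openai.com/v1/engines/davinci/completions"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",  # assumed env var
    "Content-Type": "application/json",
}
BODY = {
    "prompt": "Once upon a time",
    "max_tokens": 3,
    "temperature": 1,
    "top_p": 1,
    "n": 1,
    "stream": False,
    "logprobs": None,
    "stop": "\n",
}

# Send the identical body three times and print each completion.
for i in range(3):
    resp = requests.post(URL, headers=HEADERS, json=BODY)
    resp.raise_for_status()
    print(i, resp.json()["choices"][0]["text"])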

Output 1

"choices": [
        {
            "text": ", this column",
            "index": 0,
            "logprobs": null,
            "finish_reason": "length"
        }
    ]

Output 2

"choices": [
        {
            "text": ", winter break",
            "index": 0,
            "logprobs": null,
            "finish_reason": "length"
        }
    ]

Output 3

"choices": [
        {
            "text": ", the traditional",
            "index": 0,
            "logprobs": null,
            "finish_reason": "length"
        }
    ]

Upvotes: 4

Views: 4462

Answers (3)

lax1089

Reputation: 3473

The reason you are receiving different responses each time is that you have set temperature to a relatively high value of 1.

Temperature is the parameter that governs the randomness and creativity of responses, and can be set to a number between 0 and 2. With higher temperature values, the model takes more "risks" and chooses lower-probability tokens. If you want the output to be as close to deterministic as possible, set temperature to 0.
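
As a rough illustration (toy code, not OpenAI's actual sampler), temperature rescales the next-token probabilities before sampling: values above 0 leave room for lower-probability tokens, while 0 collapses to argmax.

import numpy as np

def sample_with_temperature(logits, temperature, rng=np.random.default_rng(0)):
    # Pick a token index from raw scores using a temperature-scaled softmax.
    if temperature == 0:
        return int(np.argmax(logits))            # deterministic: always the top token
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())        # softmax over the scaled scores
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.9, 0.5]                          # three hypothetical next tokens
print([sample_with_temperature(logits, 1.0) for _ in range(5)])  # indices typically vary between draws
print([sample_with_temperature(logits, 0.0) for _ in range(5)])  # always index 0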

Separately, OpenAI recommends setting either temperature or top_p, but not both; you are currently setting both. If you want to control response variance via temperature, I suggest removing the top_p parameter from your requests (see the sketch at the end of this answer).

Important: As stated by Boris Power (Head of Applied Research at OpenAI), even with temperature at 0, inference is non-deterministic when the top-2 token probabilities differ by less than 1%. So while responses should largely be similar across multiple runs of the same prompt (with temperature=0), there will likely still be some variance. Once you get one different token, the completions may start to diverge further.
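
Putting the two suggestions together, a request body along these lines (a sketch using only the fields from the question, with temperature set to 0 and top_p dropped) should give you responses that are as repeatable as the API allows:

body_deterministic = {
    "prompt": "Once upon a time",
    "max_tokens": 3,
    "temperature": 0,   # as close to deterministic as the API allows
    "n": 1,             # top_p removed, per the recommendation above
    "stream": False,
    "logprobs": None,
    "stop": "\n",
}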

Upvotes: 1

Yassine Khachlek

Reputation: 1144

OpenAI documentation:

https://beta.openai.com/docs/api-reference/completions/create

temperature (number, optional, defaults to 1)

What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer.

We generally recommend altering this or top_p but not both.

top_p (number, optional, defaults to 1)

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
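
To make the nucleus idea concrete, here is a toy illustration (not OpenAI's code): keep the smallest set of most-likely tokens whose cumulative probability reaches top_p, renormalise, and sample only from that set.

import numpy as np

def nucleus_sample(probs, top_p, rng=np.random.default_rng(0)):
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]                        # most likely tokens first
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1   # size of the nucleus
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()           # renormalise inside the nucleus
    return int(rng.choice(kept, p=kept_probs))

probs = [0.55, 0.30, 0.10, 0.05]     # hypothetical next-token distribution
print(nucleus_sample(probs, 0.1))    # nucleus is just the top token
print(nucleus_sample(probs, 1.0))    # every token is a candidate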

Upvotes: 0

Duy Bui

Reputation: 1396

I just talked to OpenAI, and they said their responses are not deterministic: output is sampled probabilistically so the model can be creative. To make the output deterministic, or at least to reduce the randomness, they suggest adjusting the temperature parameter. By default it is 1, which lets the model take risks freely; if you want the output to be as deterministic as possible, set it to 0.

Another parameter, top_p (default 1), can also be used to control how deterministic the output is, but they don't recommend tweaking both temperature and top_p; adjusting one of them is enough.

Upvotes: 2
