Charlie Parker

Reputation: 5189

How to use the max number of tokens with the openai api?

I want to extract ("parse out") some text from some unstructured data. The only robust way I've found to do this is with GPT-4. When I try it in the ChatGPT (GPT-4) web interface it works -- though because there is a lot of text I do have to click the continuation button. But it works as I wanted.

But when I use it with the openai API, the output terminates too soon, which is confusing me. How does one make sure that the max number of tokens for the model is always used?

Extract text

import ast
import re

import openai

def extract_content_from_with_gpt(
        text_list: str, 
        model_name: str = 'gpt-4-1106-preview', 
        # model_name: str = 'gpt-3.5-turbo', 
        temperature: float = 0.7,
    ) -> list[str]:
    prompt: str = get_prompt_extract_list_from_text(text_list)
    response = openai.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "user", "content": prompt},
        ],
        temperature=temperature,
        # max_tokens=128_000,  # rejected by the API: 128K is the input context, not the output limit
    )
    response_text: str = response.choices[0].message.content.strip()
    # The prompt asks the model to wrap its answer in <extracted>...</extracted> tags.
    pattern = r"<extracted>(.*?)</extracted>"
    match: list[str] = re.findall(pattern, response_text, re.DOTALL)
    assert len(match) == 1
    # literal_eval is safer than eval for parsing the model's list literal.
    # extracted_list: list[str] = eval(match[0])
    extracted_list: list[str] = ast.literal_eval(match[0])
    return extracted_list

is this how it's done?

related: https://community.openai.com/t/max-tokens-how-to-get-gpt-to-use-the-maximum-available-tokens/433367

Upvotes: 1

Views: 10652

Answers (2)

sree.s

Reputation: 1

These models all have a maximum output of 4096 tokens:

  1. gpt-4-1106-preview (GPT4-Turbo): 4096

  2. gpt-4-vision-preview (GPT4-Turbo Vision): 4096

  3. gpt-3.5-turbo-1106 (GPT3.5-Turbo): 4096
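A small sketch of how one might use limits like those listed above: keep an illustrative model-to-output-cap mapping and clamp the requested `max_tokens` before the call (the `MAX_OUTPUT_TOKENS` table and `clamp_max_tokens` helper are assumptions for illustration, not part of the openai library):

```python
# Illustrative mapping: model name -> maximum *output* tokens.
# The input context window is a separate, larger limit.
MAX_OUTPUT_TOKENS = {
    "gpt-4-1106-preview": 4096,
    "gpt-4-vision-preview": 4096,
    "gpt-3.5-turbo-1106": 4096,
}

def clamp_max_tokens(model_name: str, requested: int) -> int:
    """Clamp a requested max_tokens to the model's output cap."""
    cap = MAX_OUTPUT_TOKENS.get(model_name, 4096)
    return min(requested, cap)
```

This avoids the error from passing something like `max_tokens=128_000`, which exceeds the output limit even though it fits the context window.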

Upvotes: -2

Rajib Deb

Reputation: 1774

I see that you are using gpt-4-1106-preview, which is the turbo model. The input and output token limits for GPT-4 Turbo are not the same: it has an input limit (context window) of 128K tokens, but output is limited to 4096 tokens. And although `max_tokens` can be set to 4096, that does not guarantee you will always get that many tokens; generation ends earlier when the model emits a stop token.
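A minimal sketch of a continuation loop built on this answer, mirroring the "continue" button in the web UI: request a completion, and while it stops with `finish_reason == "length"` (the output cap was hit), feed the partial answer back and ask the model to continue. The `create` callable and the continuation prompt are assumptions for illustration:

```python
from typing import Callable

def generate_until_done(
    create: Callable[[list[dict]], tuple[str, str]],
    messages: list[dict],
    max_rounds: int = 5,
) -> str:
    """Accumulate output while the model keeps stopping at the
    output-token cap. `create` wraps one API call and returns
    (content, finish_reason)."""
    parts: list[str] = []
    for _ in range(max_rounds):
        content, finish_reason = create(messages)
        parts.append(content)
        if finish_reason != "length":  # "stop" means the model finished
            break
        # Append the partial answer and ask for the rest.
        messages = messages + [
            {"role": "assistant", "content": content},
            {"role": "user", "content": "Continue exactly where you left off."},
        ]
    return "".join(parts)

# With the openai>=1.0 client, `create` could look like this (untested sketch):
# client = openai.OpenAI()
# def create(messages):
#     r = client.chat.completions.create(
#         model="gpt-4-1106-preview", messages=messages, max_tokens=4096)
#     return r.choices[0].message.content, r.choices[0].finish_reason
```

Checking `finish_reason` is the reliable signal here: `"length"` means the output cap was hit mid-answer, while `"stop"` means the model ended on its own.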

Upvotes: 2
