Ada Boese

How to count prompt and completion tokens using Vercel's AI SDK?

I know that the SDK reports usage in the onFinish callback, but I want to count the tokens myself.

I am trying to count tokens for gpt-4o-2024-05-13, which I can tokenize using https://www.npmjs.com/package/gpt-tokenizer

However, the problem I am running into is that there is a large discrepancy between the input tokens I am able to count myself and what Vercel reports (the OpenAI logs match Vercel's reporting, so I know the reported numbers are accurate).

import { streamText } from 'ai';

const { fullStream } = await streamText({
  abortSignal: signal,
  maxSteps: 20, // multi-step: tool calls plus follow-up generations
  messages: truncatedMessages,
  model: createModel(llmModel.nid),
  tools: await createTools({
    chatSessionMessageId,
  }),
});

for await (const chunk of fullStream) {
  // ...
}
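For reference, this is how I read the usage that the SDK itself reports and that I compare my own counts against. A minimal sketch, assuming the AI SDK v4 onFinish callback shape (a usage object with promptTokens and completionTokens):

const { fullStream } = await streamText({
  // ...same options as above...
  onFinish: ({ usage }) => {
    // provider-reported counts; these match the OpenAI logs
    console.log(usage.promptTokens, usage.completionTokens);
  },
});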

Assuming that this is how I am sending messages to the LLM, that I am streaming the response, and that I have a function tokenize(subject: string): string[], what is the correct way to calculate the tokens used by the prompt and by the completion?

For context, here is what I have tried. To count the completion:

let content = '';

for await (const chunk of fullStream) {
  if (chunk.type === 'text-delta') {
    content += chunk.textDelta;
  }
}

tokenize(content).length;

I would expect this to give an accurate completion_tokens, but the number Vercel reports is almost 40% higher.
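Since the call uses tools and maxSteps, I assume the completion also includes generated tool-call arguments, which plain textDelta accumulation never sees. A sketch of counting those too, assuming the v4 fullStream chunk shape (type 'tool-call' with an args object; JSON.stringify is only an approximation of what the model actually emitted):

let completionText = '';

for await (const chunk of fullStream) {
  if (chunk.type === 'text-delta') {
    completionText += chunk.textDelta;
  } else if (chunk.type === 'tool-call') {
    // generated tool-call arguments also count as completion tokens
    completionText += JSON.stringify(chunk.args);
  }
}

tokenize(completionText).length;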

I tried this to count the input tokens:

tokenize(
  truncatedMessages
    .map((message) => message.content)
    .join('\n'),
).length;

but that's also a lot less than what Vercel/OpenAI reports.
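I am aware that gpt-tokenizer also exposes encodeChat, which adds the per-message chat framing tokens that a plain join misses. A sketch, assuming the model-specific entry point is correct for gpt-4o and that every message content is a plain string:

import { encodeChat } from 'gpt-tokenizer/model/gpt-4o';

const chat = truncatedMessages.map((message) => ({
  role: message.role,
  content: String(message.content),
}));

// encodeChat inserts the chat-format framing tokens around each message
const promptTokens = encodeChat(chat).length;

I suspect the serialized tool definitions that the SDK sends with the request also contribute, but they never appear in truncatedMessages, so I cannot count them this way.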

Where do the extra tokens come from?
