Reputation: 127
I know that their API returns usage in onFinish, but I want to count the tokens myself.
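(For reference, the number I am comparing against is the usage object from onFinish; I read it roughly like this, assuming the promptTokens/completionTokens shape I see in the AI SDK types:)
streamText({
  // ...same options as in the call below...
  onFinish: ({ usage }) => {
    // These are the counts that match the OpenAI logs.
    console.log(usage.promptTokens, usage.completionTokens);
  },
});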
I am trying to count tokens for gpt-4o-2024-05-13, which I can tokenize using https://www.npmjs.com/package/gpt-tokenizer.
However, the problem I am running into is that there is a wildly large difference between what I count as the input and what Vercel reports (the OpenAI logs match Vercel's reporting, so I know it is accurate).
const { fullStream } = await streamText({
  abortSignal: signal,
  maxSteps: 20,
  messages: truncatedMessages,
  model: createModel(llmModel.nid),
  tools: await createTools({
    chatSessionMessageId,
  }),
});

for await (const chunk of fullStream) {
  // ...
}
So assuming that this is how I am sending messages to the LLM, that I am streaming the response, and that I have a function tokenize(subject: string): string[], what is the correct way to calculate the tokens used by the prompt and the completion?
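(For reference, tokenize is just a thin wrapper over gpt-tokenizer, roughly along these lines; I am assuming its o200k_base entry point is the right encoding for gpt-4o:)
import { encode, decode } from 'gpt-tokenizer/encoding/o200k_base';

// Splits a string into its individual token strings, so that
// tokenize(text).length is the token count.
const tokenize = (subject: string): string[] => {
  return encode(subject).map((token) => decode([token]));
};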
For context, what I've tried is something like:
let content = '';
for await (const chunk of fullStream) {
  if (chunk.type === 'text-delta') {
    content += chunk.textDelta;
  }
}
tokenize(content).length;
I would expect this to give an accurate completion_tokens count, but the number Vercel reports is almost 40% higher.
I tried this to count input:
tokenize(
  truncatedMessages
    .map((message) => {
      return message.content;
    })
    .join('\n'),
).length;
but that's also a lot less than what Vercel/OpenAI reports.
Where do the extra tokens come from?
Upvotes: 0
Views: 207