Neil C. Obremski

Reputation: 20314

How to encode/decode tokens in `vertexai` for pre-trained (bison) chat models?

Is there a tiktoken-equivalent either within vertexai or google.cloud.aiplatform for pretrained chat and text models such as bison?

I want to be able to count tokens before sending a request so that I can programmatically determine which and how much information to place into the context, examples, and message_history properties.

The closest thing I can find is this reference to an API endpoint that returns the billable token count: https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count. I'd rather not make a slow HTTPS round trip just to get a count.
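In the meantime, a rough character-based estimate can serve as a stopgap for deciding what fits. This is only a sketch: the ~4 characters/token ratio is a common heuristic for English text, not a documented property of the bison tokenizer, and the `truncate_to_budget` helper is hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; ~4 chars/token is a common English-text heuristic."""
    return max(1, round(len(text) / chars_per_token))


def truncate_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep whole chunks, in order, until the estimated token budget is spent.

    Useful for deciding how many context/example/history entries to include
    before sending the request.
    """
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Since the estimate can undershoot, it's safest to reserve some headroom below the model's documented input limit rather than filling it exactly.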

Also, the exception raised by `send_message` doesn't distinguish an oversized input from other failures, which makes it unusable as a signal for truncation. The exception carries only this vague message:

400 The request cannot be processed. The most likely reason is that the provided input exceeded the model's input token limit.

Upvotes: 6

Views: 1404

Answers (1)

Laurent PICARD

Reputation: 56

  • You can count text tokens locally with the Vertex AI SDK for Python (starting with version 1.57.0), assuming you're now using Gemini instead of Bison.
  • Check out this Medium article for details: Counting Gemini text tokens locally.

Upvotes: 0
