Neil C. Obremski

Reputation: 20314

How to encode/decode tokens in `vertexai` for pre-trained (bison) chat models?

Is there a tiktoken-equivalent either within vertexai or google.cloud.aiplatform for pretrained chat and text models such as bison?

I want to be able to count tokens before sending a request so that I can programmatically determine which and how much information to place into the context, examples, and message_history properties.

The closest thing I can find is this reference to an API endpoint that returns the billable token count: https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count. I'd rather not make a slow HTTPS round trip just to get a count.
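In the meantime, a rough character-based estimate can serve as a stopgap for deciding what fits. This is only a sketch: the ~4 characters/token ratio is a common heuristic for English text, not a documented property of the bison tokenizer, and the `truncate_to_budget` helper is hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; ~4 chars/token is a common English-text heuristic."""
    return max(1, round(len(text) / chars_per_token))


def truncate_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep whole chunks, in order, until the estimated token budget is spent.

    Useful for deciding how many context/example/history entries to include
    before sending the request.
    """
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Since the estimate can undershoot, it's safest to reserve some headroom below the model's documented input limit rather than filling it exactly.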

Also, the exception raised by `send_message` doesn't distinguish an oversized input from other failures, which makes it unusable as a signal for truncation. The exception carries only this vague message:

400 The request cannot be processed. The most likely reason is that the provided input exceeded the model's input token limit.

Upvotes: 6

Views: 1404

Answers (1)

Laurent PICARD

Reputation: 56

  • You can count text tokens locally with the Vertex AI SDK for Python (starting with version 1.57.0), assuming you're now using Gemini instead of Bison.
  • Check out this Medium article for details: Counting Gemini text tokens locally.

Upvotes: 0
