Reputation: 19
I'm testing the Google PaLM API to recursively summarize a long text. I've run into rate-limiting issues and have some questions I'd like to verify.
It seems that the bison API is limited to 60 requests per minute (this seems quite low).
Thanks!
I tried looking into these documents:
1. Rate limit documentation: the table of rate limits
2. Increasing rate limits, but it seems like that isn't meant for the bison model
Upvotes: 1
Views: 1391
Reputation: 135
You can also submit a batch request as part of a Vertex AI pipeline job: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/batch_eval_llm.ipynb
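If you stay with online requests instead, one option is to throttle on the client side so you never exceed the 60 requests/min figure from the question. This is only a minimal sketch: `RateLimiter` and `summarize_all` are hypothetical helper names, and `summarize` stands in for whatever function you use to call the model (it is not part of any Google SDK). It uses only the Python standard library.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any rolling window of `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have already left the rolling window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: sleep until the oldest call expires, then re-check.
            self.sleep(self.period - (now - self.calls[0]))


def summarize_all(chunks, summarize, limiter):
    """Call `summarize` (your own model-calling function) once per chunk, throttled."""
    results = []
    for chunk in chunks:
        limiter.wait()
        results.append(summarize(chunk))
    return results
```

You would create one `RateLimiter()` and share it across every level of the recursive summarization, so that all calls count against the same window. The limit of 60 is an assumption taken from the question; adjust `max_calls` to whatever quota your project actually has.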
Upvotes: 0