Rob

Reputation: 99

Add limits to requests (quotas)

I have been looking at the different quotas for Vertex AI.

I have checked the "quotas & system limits" for Vertex AI and there are thousands of quotas.

I am currently testing the Vertex AI SDKs, specifically with Gemini and other models. I am trying things like chat prompts, text prompts, etc.

E.g.: https://cloud.google.com/vertex-ai/docs/generative-ai/text/test-text-prompts

I would like to limit the number of API requests per minute/day. Can someone help me understand which quotas I should limit under "quotas & system limits", given that there are thousands of them?

Thanks

Upvotes: 0

Views: 784

Answers (1)

2bob

Reputation: 1

On the client side, you can add a delay before or after each API request so that your application never exceeds a chosen request rate. How you do this depends on the application and language you use.
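As a rough illustration of the delay idea, here is a minimal Python sketch that spaces calls out so that no more than a chosen number are started per minute. The names `RateLimiter`, `call_with_limit` and `send_request` are hypothetical and not part of any Vertex AI SDK; `send_request` stands in for whatever function you use to call the model. It only throttles a single process and does not enforce a daily cap.

```python
import time


class RateLimiter:
    """Spaces calls evenly so at most `max_per_minute` start per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        # Minimum number of seconds between the start of two calls.
        self.min_interval = 60.0 / max_per_minute
        # clock and sleep are injectable so the limiter can be tested
        # without actually waiting.
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval


def call_with_limit(limiter, send_request, *args, **kwargs):
    """Wait for the limiter, then forward the call to send_request.

    send_request is a placeholder (an assumption) for your own function
    that performs the actual API request.
    """
    limiter.wait()
    return send_request(*args, **kwargs)
```

For example, `limiter = RateLimiter(max_per_minute=30)` followed by `call_with_limit(limiter, my_request_function, prompt)` would leave at least two seconds between the start of consecutive requests. This is a throttle inside your own code, so it complements the quota settings in the console rather than replacing them.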

For the quotas themselves, a good starting point is https://cloud.google.com/vertex-ai/generative-ai/docs/quotas.

In the quotas list, you might find it easier to filter for base_model:gemini-pro and your region of choice. Once you locate the quota you want, the 'more actions' menu (three vertical dots) on the far right of that row gives you the option to 'Create usage alert'.

Hope this helps.

Upvotes: 0
