petersergeant

Reputation: 334

What steps can I use to debug a GoogleGenerativeAI Error: Resource has been exhausted (e.g. check quota)

I am using Google's NodeJS SDK for GoogleGenerativeAI. I am making completion requests:

  const modelObj = this._googleAiClient.getGenerativeModel({ model: 'gemini-1.5-flash' });
  const result = await modelObj.generateContent(prompt);

I am using an API key with billing enabled (screenshot not preserved).

However, I'm consistently getting:

GoogleGenerativeAIFetchError: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent: [429 Too Many Requests] Resource has been exhausted (e.g. check quota).

If I visit the quotas page: https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas

and search for gemini-1.5-flash, then the only limit I look remotely close to hitting is "Request limit per model per minute for a project in the free tier" -- however, as this is a Paid Plan, I wouldn't expect to be hitting that.

Does anyone know how I can debug this?

Upvotes: 0

Views: 466

Answers (1)

guillaume blaquiere

Reputation: 75745

If you go to the following page, you can see the quotas at the project level (you might have to select your project first): https://console.cloud.google.com/iam-admin/quotas?referrer=search&pageState=(%22allQuotasTable%22:(%22f%22:%22%255B%257B_22k_22_3A_22_22_2C_22t_22_3A10_2C_22v_22_3A_22_5C_22gemini_5C_22_22%257D%255D%22))

Here is a screenshot of what I have (image not preserved).

Pick the right model (1.5 Flash in your case) and check which quota is being hit. In summary, on the paid plan:

  • 4 million tokens per minute
  • 2,000 requests per minute
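
To see whether your own traffic comes anywhere near these numbers, you could count requests and tokens per minute on the client side. This is only a minimal sketch: `UsageWindow` and `LIMITS` are hypothetical names, the limits are simply the figures quoted above, and reading the token count from `result.response.usageMetadata` is an assumption about what the SDK returns for your model.

```javascript
// Paid-plan figures quoted above; check the quota page for your own project.
const LIMITS = { requestsPerMinute: 2000, tokensPerMinute: 4_000_000 };

// Sliding-window counter: records each request and reports how many
// requests and tokens were sent within the last `windowMs` milliseconds.
class UsageWindow {
  constructor(windowMs = 60_000) {
    this.windowMs = windowMs;
    this.events = []; // entries of the form { time, tokens }
  }

  // Record one request and the number of tokens it consumed.
  record(tokens, now = Date.now()) {
    this.events.push({ time: now, tokens });
  }

  // Drop events older than the window, then total what is left.
  snapshot(now = Date.now()) {
    this.events = this.events.filter((e) => now - e.time < this.windowMs);
    const tokens = this.events.reduce((sum, e) => sum + e.tokens, 0);
    return { requests: this.events.length, tokens };
  }
}

const usage = new UsageWindow();
// Hypothetical usage after each generateContent() call:
//   usage.record(result.response.usageMetadata?.totalTokenCount ?? 0);
//   console.log(usage.snapshot(), LIMITS);
```

If the snapshot stays far below the limits when the 429 arrives, the request is probably being counted against a different quota than the one you expect (for example the free-tier one mentioned in the question).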

If you are only generating text and you are the only user on the project, hitting these limits is strange.

If several users share the project, or if you are running a script to batch-generate text, it could happen. The same applies if you upload 3 or 4 long videos in parallel.
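
If a batch script is the cause, one common mitigation is to retry 429 responses with exponential backoff. The sketch below is not part of the SDK: `isRateLimitError`, `backoffDelayMs`, and `withBackoff` are hypothetical helpers, and the check on `err.status` assumes the SDK's fetch error exposes the HTTP status as a number (the message check is a fallback based on the error text in the question).

```javascript
// Returns true for errors that look like an HTTP 429 response.
// Assumption: the error carries a numeric `status`; otherwise fall back
// to looking for "429" in the error message.
function isRateLimitError(err) {
  return err?.status === 429 || /\b429\b/.test(String(err?.message ?? ''));
}

// Exponential backoff delay: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30_000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fn(), retrying only on rate-limit errors, up to maxRetries times.
// Any other error, or a rate-limit error after the last retry, is rethrown.
async function withBackoff(fn, maxRetries = 5) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimitError(err) || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

With the code from the question, this could be used as `const result = await withBackoff(() => modelObj.generateContent(prompt));`. Backoff only helps with per-minute limits; it will not help if the key is being counted against a quota you did not intend to use.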

Upvotes: 0
