Reputation: 127
For a personal project, I'm trying to use Vertex AI to run a TFX pipeline that trains a model with custom training, based on this guide. When I run the pipeline, I get this error:
com.google.cloud.ai.platform.common.errors.AiPlatformException: code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus
On the IAM quotas I have limit "1" for the resource "Custom model training CPUs for N1/E2 machine types per region", for all regions, and 0% current usage for each one of them. I tried multiple regions and multiple machine types (n1, e2, ...) and I always get the same quota limit error.
Can anyone explain why I'm getting this quota error?
Upvotes: 2
Views: 2595
Reputation: 31
On the IAM quotas I have limit "1" for the resource "Custom model training CPUs for N1/E2 machine types per region", for all regions, and 0% current usage for each one of them.
The default quotas for aiplatform.googleapis.com/custom_model_training_cpus are listed here, and they range from 20 to 2,200, depending on the region. I'm not sure why your limit would be 1, but I believe that means you will not be able to use any machine type with more than one vCPU, so you can't even use "n1-standard-2", even if your pipeline would only use one machine of type "n1-standard-2" for training.
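To make that arithmetic concrete, here is a minimal Python sketch. It assumes the quota is counted as vCPUs per machine times the number of machines, which is my reading of the error and not something I have confirmed. The helper names (vcpus_for_machine_type, fits_quota) are hypothetical and not part of any Google Cloud library; the vCPU count is simply read from the trailing number in predefined machine type names like "n1-standard-2".

```python
import re


def vcpus_for_machine_type(machine_type: str) -> int:
    """Read the vCPU count from a predefined machine type name.

    Names such as "n1-standard-2" or "e2-highmem-4" end with their vCPU count.
    Hypothetical helper: shared-core and custom types are not handled.
    """
    match = re.fullmatch(r"[a-z0-9]+-(?:standard|highmem|highcpu)-(\d+)", machine_type)
    if match is None:
        raise ValueError(f"Unrecognized machine type name: {machine_type!r}")
    return int(match.group(1))


def fits_quota(machine_type: str, replica_count: int, quota_limit: int,
               current_usage: int = 0) -> bool:
    """Return True if the requested vCPUs fit under the quota limit.

    Assumption: the quota counts vCPUs per machine times number of machines.
    """
    requested = vcpus_for_machine_type(machine_type) * replica_count
    return current_usage + requested <= quota_limit


if __name__ == "__main__":
    # One n1-standard-2 machine asks for 2 vCPUs, which exceeds a limit of 1.
    print(fits_quota("n1-standard-2", replica_count=1, quota_limit=1))
    # The same request fits under a limit of 20.
    print(fits_quota("n1-standard-2", replica_count=1, quota_limit=20))
```

Under this assumption, a limit of 1 rejects the request even at 0% current usage, which would match the behavior you are seeing.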
One thing you can try is editing the quotas for your GCP project from the quotas page by selecting the quota(s) in the table and then clicking the "Edit quotas" button.
Upvotes: 1