Vertex AI Pipeline quota aiplatform.googleapis.com/restricted_image_training_tpu_v3_pod

Question

I'm getting started with creating a tuned model. My training data is in a .jsonl file uploaded to a bucket, and everything checks out. I've run the tuning 3 times, and every time it fails on step 7 of 8.

com.google.cloud.ai.platform.common.errors.AiPlatformException: code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: aiplatform.googleapis.com/restricted_image_training_tpu_v3_pod, cause=null; Failed to create custom job.Project number: 643054741456, Job id: 7035574795022368768, Task id: -2160728365068189696, Task name: large-language-model-tuner, Task state: DRIVER_SUCCEEDED, Execution name: projects/643054741456/locations/europe-west4/metadataStores/default/executions/6209974609820216962; Failed to create external task or refresh its state. Task:Project number: 643054741456, Job id: 7035574795022368768, Task id: -2160728365068189696, Task name: large-language-model-tuner, Task state: DRIVER_SUCCEEDED, Execution name: projects/643054741456/locations/europe-west4/metadataStores/default/executions/6209974609820216962; Failed to handle the pipeline task. Task: Project number: 643054741456, Job id: 7035574795022368768, Task id: -2160728365068189696, Task name: large-language-model-tuner, Task state: DRIVER_SUCCEEDED, Execution name: projects/643054741456/locations/europe-west4/metadataStores/default/executions/6209974609820216962

I followed the steps here: Vertex AI pipeline quota, with no luck.

I searched the quotas, and the one listed in the error message shows I'm at 0% usage.

It also shows no quotas are over 90%.

The docs say that these pipelines only run in us-central1. When I inspect the quota for restricted_image_training_tpu_v3_pod, it says my quota is 0. I can request an increase to 1, but I would have thought the docs would mention that you can't get started without it.
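
For anyone hitting the same error, here is a minimal sketch in plain Python (no Google libraries) that pulls the quota metric and the region out of the error text. The function name `parse_quota_error` and the regular expressions are my own assumptions about the message layout, based only on the error quoted above, and the sample string is an abbreviated copy of that error. With this message it reports europe-west4, which suggests the quota to inspect is the one for that region.

```python
import re


def parse_quota_error(message: str) -> dict:
    """Extract quota metric names and the region from a RESOURCE_EXHAUSTED message.

    Hypothetical helper: the patterns assume the layout of the error quoted above.
    """
    metrics = []
    # The metric list follows "exceed quota limits:" and ends at ", cause=" or ";".
    found = re.search(r"exceed quota limits:\s*([^;]+?)(?:, cause=|;|$)", message)
    if found:
        metrics = [part.strip() for part in found.group(1).split(",") if part.strip()]
    # The region appears in resource names such as projects/.../locations/<region>/...
    location = re.search(r"/locations/([a-z0-9-]+)/", message)
    return {"metrics": metrics, "location": location.group(1) if location else None}


# Abbreviated copy of the error message from this question.
error = (
    "code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: "
    "aiplatform.googleapis.com/restricted_image_training_tpu_v3_pod, cause=null; "
    "Failed to create custom job. Execution name: "
    "projects/643054741456/locations/europe-west4/metadataStores/default/"
    "executions/6209974609820216962"
)

result = parse_quota_error(error)
print(result["metrics"])
print(result["location"])
```

The output only tells you which metric and which region to look up on the quotas page. It does not query the quota itself.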

Here's what the pipeline looks like:

[Image: Vertex AI Pipeline quota aiplatform.googleapis.com/restricted_image_training_tpu_v3_pod]
