Make gemini-1.5-flash-002 accessible for my GCloud Run project

Question

I am trying a basic script to summarize text:

def generate(self, text_to_summarize):
    vertexai.init(project="


This works as intended locally, using "gemini-1.5-flash-002".
To run it on Cloud Run, I built the script into a Docker container and deployed it to Cloud Run.
Calling the endpoint then fails with this error:
"PermissionDenied(\"Permission 'aiplatform.endpoints.predict' denied on resource '//aiplatform.googleapis.com/projects//locations//publishers/google/models/gemini-1.5-flash-002' (or it may not exist).\")"
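
Since the same code works locally, one thing that may be worth ruling out is that the container runs as a different identity than the local credentials. The following is a minimal sketch, not part of the original script, that asks the metadata server which service account the running instance uses. The function names and the injectable `opener` parameter are hypothetical and exist only to make the helper testable; the metadata URL and the `Metadata-Flavor: Google` header are the standard ones for Google Cloud runtimes.

```python
import urllib.request

# Standard metadata server path for the default service account's email.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_identity_request(url=METADATA_URL):
    """Build the request; the metadata server requires this header."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def runtime_service_account(timeout=2.0, opener=urllib.request.urlopen):
    """Return the email of the service account the instance runs as.

    `opener` is a hypothetical hook so the call can be replaced in tests;
    by default it performs a real HTTP request, which only succeeds when
    running on Google Cloud (for example inside the Cloud Run container).
    """
    request = build_identity_request()
    with opener(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Logging the result of `runtime_service_account()` at startup would show which member to look for in the IAM policy output.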

I have double-checked permissions with gcloud projects get-iam-policy  and see:
bindings:
- members:
  - serviceAccount:service-@gcp-sa-vertex-op.iam.gserviceaccount.com
  role: roles/aiplatform.onlinePredictionServiceAgent
- members:
  - serviceAccount:service-@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:-compute@developer.gserviceaccount.com
  - user:
  role: roles/aiplatform.user
...
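
The check above can also be scripted rather than read by eye. This is a rough sketch that assumes the policy was exported as JSON (for example with `gcloud projects get-iam-policy` plus the global `--format=json` flag) and loaded into a dict. The helper names are hypothetical, and the member string passed in must be the identity the Cloud Run service actually runs as, which is not necessarily the default compute service account.

```python
def members_with_role(policy, role):
    """Collect every member bound to `role` in an IAM policy dict."""
    members = []
    for binding in policy.get("bindings", []):
        if binding.get("role") == role:
            members.extend(binding.get("members", []))
    return members


def has_role(policy, member, role):
    """True if `member` (e.g. 'serviceAccount:...') is bound to `role`."""
    return member in members_with_role(policy, role)
```

For example, `has_role(policy, member, "roles/aiplatform.user")` returns `True` only when that exact member string appears in the binding for the role.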

I checked the roles here, and aiplatform.endpoints.predict is a permission included in roles/aiplatform.user, so I should have permission.
This has led me to conclude that the model does not exist. I thought Cloud Run would automatically use the Gemini Flash model, as it does locally. I have run
gcloud ai models list --region=
and there are no models.
Even trying to deploy that model to my endpoint fails. The command to deploy is:
gcloud ai endpoints deploy-model \
   --model=gemini-1.5-flash-002 \
   --region= \
   --display-name="flash-deployment" \
   --machine-type="n1-standard-4"

and this fails with
(gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects//locations//models/gemini-1.5-flash-002' exists.
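
The two error messages name two different kinds of resource: the prediction error refers to a path under `publishers/google/models/`, while the deploy error refers to a path under the project's own `models/` collection. The sketch below only illustrates that difference in shape; the function names are hypothetical, and the project and location arguments are placeholders to be filled in by the caller.

```python
def publisher_model_name(project, location, model, publisher="google"):
    """Resource name shape seen in the PermissionDenied error."""
    return (
        f"projects/{project}/locations/{location}/"
        f"publishers/{publisher}/models/{model}"
    )


def project_model_name(project, location, model):
    """Resource name shape seen in the deploy-model error."""
    return f"projects/{project}/locations/{location}/models/{model}"


def is_publisher_model(resource_name):
    """True if the resource name points at a publisher-hosted model."""
    return "/publishers/" in resource_name
```

Under this reading, a model that lives under `publishers/google` would not be expected to show up in a listing of the project's own models, which would be consistent with the empty `gcloud ai models list` output.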

I think I have to register the model somewhere, but when I open the Model Registry and try to "Create" one, it asks me for training data and so on. I do not want to train a new model; I just want to use the pretrained Flash one.
Does anyone know how this can be achieved?


Answers (0)
