I am trying a basic script to summarize text:
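import vertexai
from vertexai.generative_models import GenerativeModel

def generate(self, text_to_summarize):
    # Point the SDK at the project and region that should serve the request.
    vertexai.init(project="<PROJECT_ID>", location="MY_REGION")
    model = GenerativeModel(
        "gemini-1.5-flash-002",
        system_instruction=[my_prompt],
    )
    # Stream the summary back chunk by chunk.
    responses = model.generate_content(
        [text_to_summarize],
        stream=True,
    )
    for response in responses:
        print(response.text, end="")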
This works as intended locally, using "gemini-1.5-flash-002".
To run it on Cloud Run, I built the script into a Docker container and deployed it.
Calling the endpoint then fails with this error:
"PermissionDenied(\"Permission 'aiplatform.endpoints.predict' denied on resource '//aiplatform.googleapis.com/projects/<PROJECT-ID>/locations/<REGION>/publishers/google/models/gemini-1.5-flash-002' (or it may not exist).\")"
I have double-checked permissions with gcloud projects get-iam-policy <PROJECT-ID>
and see:
bindings:
- members:
  - serviceAccount:service-<CODE>@gcp-sa-vertex-op.iam.gserviceaccount.com
  role: roles/aiplatform.onlinePredictionServiceAgent
- members:
  - serviceAccount:service-<CODE>@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:<CODE>[email protected]
  - user:<MY-EMAIL>
  role: roles/aiplatform.user
...
I checked the models here, and aiplatform.endpoints.predict is a permission included in roles/aiplatform.user, so I have permission.
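The same check can be scripted rather than read by eye. Below is a minimal sketch, assuming the policy is exported as JSON (for example with `gcloud projects get-iam-policy <PROJECT-ID> --format=json`) and loaded into a dict. The helper name `members_with_role` and the sample member strings are hypothetical and only mirror the shape of the output above.

```python
import json


def members_with_role(policy: dict, role: str) -> list:
    """Return every member bound to `role` in an IAM policy dict."""
    members = []
    for binding in policy.get("bindings", []):
        if binding.get("role") == role:
            members.extend(binding.get("members", []))
    return members


# Hypothetical sample policy; the member names are placeholders, not real accounts.
SAMPLE_POLICY = json.loads("""
{
  "bindings": [
    {"role": "roles/aiplatform.serviceAgent",
     "members": ["serviceAccount:service-123@gcp-sa-aiplatform.iam.gserviceaccount.com"]},
    {"role": "roles/aiplatform.user",
     "members": ["serviceAccount:example-sa@example-project.iam.gserviceaccount.com",
                 "user:someone@example.com"]}
  ]
}
""")

if __name__ == "__main__":
    print(members_with_role(SAMPLE_POLICY, "roles/aiplatform.user"))
```

This only confirms which members hold the role at the project level. It does not show which service account the Cloud Run service actually runs as, so that would need to be compared against the list separately.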
This has led me to conclude that the model does not exist. I thought Cloud Run would automatically use the Gemini Flash model, as it does locally. I have run
gcloud ai models list --region=<REGION>
and there are no models.
Even trying to deploy that model to my endpoint fails. The code to deploy is:
gcloud ai endpoints deploy-model <MY-ENDPOINT-ID> \
  --model=gemini-1.5-flash-002 \
  --region=<REGION> \
  --display-name="flash-deployment" \
  --machine-type="n1-standard-4"
and this fails with
(gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/<PROJECT-ID>/locations/<REGION>/models/gemini-1.5-flash-002' exists.
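One detail that may matter: the two error messages name differently shaped resources. The first refers to a path under `publishers/google/models/`, while the deploy command looks under the project's own `models/` collection. The sketch below just makes that difference explicit. The function `classify_model_resource` is hypothetical, and the patterns are inferred only from the two messages quoted above, not from any official naming specification.

```python
import re

# Optional "//aiplatform.googleapis.com/" prefix, as seen in the first error.
_PREFIX = r"^(?://aiplatform\.googleapis\.com/)?projects/[^/]+/locations/[^/]+/"

PUBLISHER_MODEL = re.compile(_PREFIX + r"publishers/[^/]+/models/[^/]+$")
PROJECT_MODEL = re.compile(_PREFIX + r"models/[^/]+$")


def classify_model_resource(name: str) -> str:
    """Label a resource name as 'publisher', 'project', or 'unknown'."""
    if PUBLISHER_MODEL.match(name):
        return "publisher"
    if PROJECT_MODEL.match(name):
        return "project"
    return "unknown"


if __name__ == "__main__":
    # Resource names copied from the two error messages.
    predict_error = (
        "//aiplatform.googleapis.com/projects/<PROJECT-ID>/locations/<REGION>"
        "/publishers/google/models/gemini-1.5-flash-002"
    )
    deploy_error = (
        "projects/<PROJECT-ID>/locations/<REGION>/models/gemini-1.5-flash-002"
    )
    print(classify_model_resource(predict_error))
    print(classify_model_resource(deploy_error))
```

If the two names really do point at different collections, an empty result from `gcloud ai models list` would not by itself say anything about the model named in the first error.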
I think I have to register the model somewhere, but when I open the model registry and try to "Create" one, it asks me for training data and so on. I do not want to train a new model, just use the flash pretrained one.
Does anyone know how this can be achieved?