I am trying to create a batch prediction job with textembedding-gecko-multilingual@latest, with no success.
After several failures with the error
Failed to import data. Not found: Dataset he5311ce8b310161b-tp:llm_bp_tenant_dataset_2558528304743186432 was not found in location US
I created a new dataset pinned to a single location, us-central1.
The table has only one column, content.
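For context, the dataset and table were set up roughly like this (a minimal sketch using the google-cloud-bigquery client; the names match the URIs in the jobs below):

from google.cloud import bigquery

client = bigquery.Client(project="xxx-data")

# Dataset pinned to the single region us-central1 (not the US multi-region).
dataset = bigquery.Dataset("xxx-data.job_results")
dataset.location = "us-central1"
client.create_dataset(dataset, exists_ok=True)

# Input table with a single STRING column named "content".
table = bigquery.Table(
    "xxx-data.job_results.articles_4_eval",
    schema=[bigquery.SchemaField("content", "STRING")],
)
client.create_table(table, exists_ok=True)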
I then tried two different approaches:
1.
import subprocess
import requests

url = "https://us-central1-aiplatform.googleapis.com/v1/projects/xxx-data/locations/us-central1/batchPredictionJobs"
TOKEN = subprocess.getoutput("gcloud auth print-access-token")
headers = {"Authorization": f"Bearer {TOKEN}"}

# Batch prediction job request: read rows from the BigQuery table and
# write embeddings to a BigQuery destination table.
req = {
    "name": "Gecko_embedding",
    "displayName": "Gecko_embedding",
    "model": "publishers/google/models/textembedding-gecko-multilingual",
    "inputConfig": {
        "instancesFormat": "bigquery",
        "bigquerySource": {
            "inputUri": "bq://xxx-data.job_results.articles_4_eval",
        },
    },
    "outputConfig": {
        "predictionsFormat": "bigquery",
        "bigqueryDestination": {
            "outputUri": "bq://xxx-data.job_results.articles_4_eval_vectors",
        },
    },
}
response = requests.post(url, headers=headers, json=req)
2.
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="xxx-data", location="us-central1")
model = TextEmbeddingModel.from_pretrained("textembedding-gecko-multilingual@latest")
task_type = "SEMANTIC_SIMILARITY"
model.batch_predict(
    dataset="bq://xxx-data.job_results.articles_4_eval",
    destination_uri_prefix="bq://xxx-data.job_results.articles_4_eval_vectors",
)
For both I get the same error:
Batch prediction job Gecko_embedding encountered the following errors:
Do not support publisher model textembedding-gecko-multilingual
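For reference, the error above is reported on the job resource itself after the job starts running; this is a minimal sketch of how it can be read back via the REST API (assuming the POST in attempt 1 succeeded and response holds the created job):

# Sketch: poll the created batch prediction job to surface its state
# and error (assumes the POST above returned the job resource).
job_name = response.json()["name"]  # projects/.../batchPredictionJobs/...
status = requests.get(
    f"https://us-central1-aiplatform.googleapis.com/v1/{job_name}",
    headers=headers,
).json()
print(status.get("state"))  # e.g. JOB_STATE_FAILED
print(status.get("error"))  # carries the "Do not support publisher model ..." message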