Tobias Hermann


How to create cached Gemini content using the VertexAI API in Kotlin/Java?

When trying to create cached content like this:

import com.google.cloud.aiplatform.v1.*
import com.google.protobuf.Duration

val instructions = "hello ".repeat(40000)
val cachedContent =
    CachedContent.newBuilder()
        .setSystemInstruction(
            Content.newBuilder().addParts(Part.newBuilder().setText(instructions)).build()
        )
        .setName("example-cache-2")
        .setTtl(Duration.newBuilder().setSeconds(60 * 60).build())
        .setModel("gemini-1.5-flash")
        .build()
val request = CreateCachedContentRequest.newBuilder().setCachedContent(cachedContent).build()
GenAiCacheServiceClient.create().createCachedContent(request)

it results in the following error:

[...]
<p>The requested URL <code>/google.cloud.aiplatform.v1.GenAiCacheService/CreateCachedContent</code> was not found on this server.  <ins>That’s all we know.</ins>
[...]
com.google.api.gax.rpc.UnimplementedException: io.grpc.StatusRuntimeException: UNIMPLEMENTED: HTTP status code 404
[...]

I know that the env var GOOGLE_APPLICATION_CREDENTIALS is set correctly to my service account file, because getting normal model responses like this works:

import com.google.cloud.vertexai.VertexAI
import com.google.cloud.vertexai.api.GenerationConfig
import com.google.cloud.vertexai.generativeai.ContentMaker
import com.google.cloud.vertexai.generativeai.GenerativeModel
import com.google.cloud.vertexai.generativeai.ResponseHandler

fun foo() {
    val vertexAi = VertexAI(MY_PROJECT, "us-central1")
    val model = GenerativeModel.Builder()
        .setModelName("gemini-1.5-flash")
        .setVertexAi(vertexAi)
        .setSystemInstruction(ContentMaker.fromString("Hi!")!!)
        .build()

    val response = ResponseHandler.getText(
        model
            .withGenerationConfig(GenerationConfig.newBuilder().setTemperature(0.0f).build())
            .generateContent(listOf(ContentMaker.forRole("user").fromString("Hello.")))
    )
    println(response)
}

Also, I know that my service account has the needed permissions because in Python, I can create cached content without problems:

import datetime
import os
import random
import string

import vertexai
from vertexai.generative_models import Part
from vertexai.preview import caching

vertexai.init(project=MY_PROJECT, location="us-central1")

long_text = "hello " * 40000

cached_content = caching.CachedContent.create(
    model_name="gemini-1.5-flash-002",
    system_instruction="Hi!",
    contents=[Part.from_text(long_text)],
    ttl=datetime.timedelta(minutes=60),
    display_name="example-cache",
)

print(cached_content.name)

So, what am I doing wrong in my Kotlin code that tries to create cached content?

I am using the latest versions of the required libraries:

implementation(group = "com.google.cloud", name = "google-cloud-vertexai", version = "1.18.0")
implementation(group = "com.google.cloud", name = "google-cloud-aiplatform", version = "3.59.0")
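
One thing that might be worth checking, offered as an assumption rather than a confirmed diagnosis: the 404 on `/google.cloud.aiplatform.v1.GenAiCacheService/CreateCachedContent` would be consistent with the client calling the default global endpoint instead of a regional one. The Kotlin request above also sets no parent and uses a bare model name, whereas the Python code passes the project and location to `vertexai.init`. The sketch below is plain Java with no library dependency and only builds the strings a regional setup would presumably need. The class `VertexCacheNames` and its methods are hypothetical helpers, `my-project` is a placeholder, and the string formats are assumed conventions, not verified against this project.

```java
public class VertexCacheNames {

    // Regional endpoint in the "<location>-aiplatform.googleapis.com:443" form
    // (assumed convention for Vertex AI regional services).
    static String regionalEndpoint(String location) {
        return location + "-aiplatform.googleapis.com:443";
    }

    // Parent resource name that would go on the create request (assumed format).
    static String parent(String project, String location) {
        return "projects/" + project + "/locations/" + location;
    }

    // Fully qualified publisher model name (assumed format) instead of a bare model id.
    static String publisherModel(String project, String location, String model) {
        return parent(project, location) + "/publishers/google/models/" + model;
    }

    public static void main(String[] args) {
        String project = "my-project";   // placeholder, not a real project id
        String location = "us-central1"; // same location as in the question

        System.out.println(regionalEndpoint(location));
        System.out.println(parent(project, location));
        System.out.println(publisherModel(project, location, "gemini-1.5-flash-002"));
    }
}
```

If this guess is right, these values would presumably be supplied through `GenAiCacheServiceSettings.newBuilder().setEndpoint(...)`, `CreateCachedContentRequest.newBuilder().setParent(...)` and `CachedContent.newBuilder().setModel(...)`. The Python example also sets `display_name`, which would correspond to `setDisplayName(...)` rather than `setName(...)`. None of this has been run against the service here.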

Upvotes: 0

Views: 36

Answers (0)