Reputation: 81
I am using Google's Vertex AI SDK for Python to access the Gemini Pro model and generate content. I would like to set a timeout (e.g. 30 seconds) for the generate content call to complete or raise an exception.
Setting a timeout is easy if I use an HTTP library like Requests and query the Gemini REST endpoint directly, but how can I implement the same functionality with the Vertex AI SDK for Python?
Here is an example of the code I use to generate content:
from vertexai import init
from vertexai.preview.generative_models import GenerativeModel
from google.oauth2.service_account import Credentials
credentials = Credentials.from_service_account_file(
    'path/to/json/credential/file.json')
init(
    project=credentials.project_id, location='northamerica-northeast1',
    credentials=credentials)
model = GenerativeModel('gemini-pro')
response = model.generate_content("Pick a number")
print("Pick a number:", response.candidates[0].content.text)
Upvotes: 7
Views: 3419
Reputation: 173
If you use the async version of generate_content, generate_content_async, you can wrap the call in asyncio.wait_for from Python's standard library. If the timeout expires, wait_for cancels the task and raises TimeoutError. The script could look like:
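import asyncio
from vertexai.preview.generative_models import GenerativeModel

# Assumes vertexai.init(...) has already been called, as in the question.
model = GenerativeModel('gemini-pro')
timeout = 5  # seconds

async def main():
    try:
        response = await asyncio.wait_for(
            model.generate_content_async("Pick a number"),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        print("Request timed out")
    else:
        print("Pick a number:", response.candidates[0].content.text)

asyncio.run(main())  # await is only valid inside a coroutine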
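To try the timeout behaviour without calling Vertex AI, here is a minimal sketch of the same asyncio.wait_for pattern. The coroutine slow_call and the helper call_with_timeout are hypothetical stand-ins that are not part of the SDK; slow_call just sleeps in place of generate_content_async.

```python
import asyncio

cancelled = False  # set by the stand-in coroutine when it gets cancelled


async def slow_call(delay: float) -> str:
    """Hypothetical stand-in for model.generate_content_async(...)."""
    global cancelled
    try:
        await asyncio.sleep(delay)
        return "done"
    except asyncio.CancelledError:
        cancelled = True  # wait_for cancels the awaited call on timeout
        raise


async def call_with_timeout(delay: float, timeout: float):
    """Return the result, or None if the call exceeds the timeout."""
    try:
        return await asyncio.wait_for(slow_call(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return None


# A call that finishes in time, then one that exceeds its timeout.
fast_result = asyncio.run(call_with_timeout(delay=0.01, timeout=1.0))
slow_result = asyncio.run(call_with_timeout(delay=1.0, timeout=0.05))
print(fast_result, slow_result, cancelled)
```

Note that the cancellation only stops the local coroutine from waiting; whether the request is also abandoned on the server side depends on the underlying transport, which this sketch does not cover.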
Upvotes: 3
Reputation: 6812
Currently there is no way to change the timeout.
Are your requests getting stuck?
Upvotes: 0