GCP Gemini API - Send multimodal prompt requests using local image

Question

On this page, Google shows sample code for sending multimodal prompt requests (image + text):

    import vertexai
    
    from vertexai.generative_models import GenerativeModel, Part
    
    # TODO(developer): Update and un-comment below line
    # project_id = "PROJECT_ID"
    
    vertexai.init(project=project_id, location="us-central1")
    
    model = GenerativeModel(model_name="gemini-1.5-flash-001")
    
    image_file = Part.from_uri(
        "gs://cloud-samples-data/generative-ai/image/scones.jpg", "image/jpeg"
    )
    
    # Query the model
    response = model.generate_content([image_file, "what is this image?"])
    print(response.text)

It works fine.

What I would like to do is to perform the same task but with an image loaded locally. Something like this:

    from PIL import Image

    image_part = Part.from_image(Image.load_from_file("image.jpg"))
    response = model.generate_content([image_part,"what is this image?"])

as written in the docstring of the Part class in vertexai/generative_models/_generative_models.py, but it raises this exception:

    module 'PIL.Image' has no attribute 'load_from_file'

Is there an alternative to Part.from_uri for local images?
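For reference, the error suggests that the `Image` in the docstring is not PIL's: `PIL.Image` has no `load_from_file`, while the Vertex AI SDK ships its own `Image` class in `vertexai.generative_models`. Another route is to pass the raw bytes through `Part.from_data`. Below is a minimal sketch of that second route, assuming the file extension reflects the real image type. The helper names `load_image_bytes` and `describe_local_image` are hypothetical, not part of the SDK.

```python
import mimetypes
from pathlib import Path


def load_image_bytes(path):
    """Return (raw bytes, MIME type) for a local image file.

    Hypothetical helper: the MIME type is guessed from the file
    extension only, so a mislabeled file will be reported wrongly.
    """
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    return Path(path).read_bytes(), mime_type


def describe_local_image(model, path, prompt="what is this image?"):
    """Send a local image plus a text prompt to an already created model.

    Hypothetical helper: `model` is assumed to be a GenerativeModel
    built after vertexai.init(...), as in the sample above.
    """
    # Imported here so load_image_bytes stays usable without the SDK installed.
    from vertexai.generative_models import Part

    data, mime_type = load_image_bytes(path)
    image_part = Part.from_data(data=data, mime_type=mime_type)
    response = model.generate_content([image_part, prompt])
    return response.text
```

With the `model` from the sample above, `print(describe_local_image(model, "image.jpg"))` would replace the `Part.from_uri` call. The form the docstring appears to intend is `Part.from_image(Image.load_from_file("image.jpg"))` with `Image` imported from `vertexai.generative_models` rather than from PIL.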
