hassaan hameed

Reputation: 71

How can I get a JSON-structured response from Google gemini-pro-vision?

The response from Google Gemini (gemini_response) is a string. I need it as a JSON structure like this:

{ "Title": "chair", "Description": "A wooden chair with wheels", "Category": "Furniture", "Subcategory": "office stuff", "EstimatedPrice": "$10 - 20 " }

I tried json_load() to deserialize it, but my response is not valid JSON. How can I turn it into proper JSON?

Upvotes: 6

Views: 10382

Answers (2)

Ronnie Smith

Reputation: 18595

Looks like it's returning JSON by default? https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini#response

However, see

JSON format responses

Depending on your application, you may want the response to a prompt to be returned in a structured data format, particularly if you are using the responses to populate programming interfaces. The Gemini API provides a configuration parameter to request a response in JSON format.

Note: This response configuration option is supported only with the Gemini 1.5 Pro and 1.5 Flash models.

Basically, add this to the HTTP POST body to get JSON:

      "generationConfig": {
            "response_mime_type": "application/json"
      }
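As a rough sketch, here is how the full body might be assembled in Python using only the standard library. The `contents`/`parts` layout is my reading of the generateContent request shape, and the helper name `build_request_body` and the prompt text are made up for illustration, so check the field layout against the docs linked above. Nothing is sent over the network here.

```python
import json


def build_request_body(prompt: str) -> str:
    """Return a JSON request body that asks the model for a JSON reply."""
    body = {
        # Assumed request shape: one user turn with a single text part.
        "contents": [{"parts": [{"text": prompt}]}],
        # The setting from this answer: request application/json output.
        "generationConfig": {"response_mime_type": "application/json"},
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(build_request_body("Describe this product as JSON"))
```

Building the body with `json.dumps` avoids hand-written mistakes such as a stray trailing comma, which strict JSON parsers reject.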

Upvotes: 1

Rosário P. Fernandes

Reputation: 11336

JSON Mode has recently been introduced in the Gemini API for the Gemini 1.5 Pro and Gemini 1.5 Flash models (sadly not available in the Gemini 1.0 Pro Vision model that you're using).

But both models support image input, just like the pro vision model, as documented here.
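If you have to stay on a model without JSON mode, one possible workaround (my own sketch, not something from the Gemini docs) is to ask for JSON in the prompt and then pull the first JSON value out of the text reply. The helper name `extract_json` and the sample reply are made up for illustration.

```python
import json


def extract_json(text: str):
    """Return the first JSON object or array embedded in a free-text reply."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at `index`
                # and ignores whatever text follows it.
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                # Not valid JSON from this position; keep scanning.
                continue
    raise ValueError("no JSON object or array found in reply")


# Made-up reply: the JSON is wrapped in ordinary prose.
reply = 'Sure! Here is the result:\n{"Title": "chair", "Category": "Furniture"}\nLet me know.'
print(extract_json(reply))
```

Because it scans for the first `{` or `[`, this should also cope with replies that wrap the JSON in a Markdown code fence. It will not repair JSON that is itself malformed.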

Using the Gemini 1.5 Flash model

This model is somewhat limited when it comes to schema - you have to pass it as part of your prompt, for example:

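import google.generativeai as genai

# Replace the placeholder with your own API key
genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel('gemini-1.5-flash',
                              generation_config={"response_mime_type": "application/json"})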

prompt = 'List 5 office supplies for my e-commerce website, with this schema: { "Title": str, "Description": str, "Category": str, "Subcategory": str, "EstimatedPrice": str}'

response = model.generate_content(prompt)
print(response.text)
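Note that response.text is still a string, and because the schema only lives in the prompt here, nothing guarantees the keys. A minimal sketch of parsing and checking it, assuming the reply is a JSON list (or a single object) with the keys from the prompt above. The helper name `parse_products` and the sample string are made up for illustration.

```python
import json

# Keys requested in the prompt's schema.
EXPECTED_KEYS = {"Title", "Description", "Category", "Subcategory", "EstimatedPrice"}


def parse_products(response_text: str) -> list:
    """Parse a JSON-mode reply and check each item has the prompted keys."""
    data = json.loads(response_text)
    # Accept a single object as well as a list of objects.
    items = data if isinstance(data, list) else [data]
    for item in items:
        missing = EXPECTED_KEYS.difference(item)
        if missing:
            raise ValueError(f"item is missing keys: {sorted(missing)}")
    return items


# Made-up sample standing in for response.text
sample = (
    '[{"Title": "chair", "Description": "A wooden chair with wheels", '
    '"Category": "Furniture", "Subcategory": "office stuff", '
    '"EstimatedPrice": "$10 - 20"}]'
)
products = parse_products(sample)
print(products[0]["Title"])
```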

Using the Gemini 1.5 Pro model

If you're using the Gemini 1.5 Pro model, you can create a Python class to serve as the schema and pass it as response_schema:

import google.generativeai as genai
import typing_extensions as typing

# Define your schema
class Product(typing.TypedDict):
  title: str
  description: str
  category: str
  subcategory: str
  estimatedPrice: str

# Call the API
model = genai.GenerativeModel(model_name="models/gemini-1.5-pro")

result = model.generate_content(
  "List 5 office supplies for my e-commerce website",
  generation_config=genai.GenerationConfig(response_mime_type="application/json",
                                           response_schema = list[Product]))

print(result.text)

Which model should I use?

I think that depends on what you're looking for. The differences between the models are listed in the docs and they also come at different prices.

Upvotes: 4
