Light Yagmi

Reputation: 5235

Data source can't be referenced by Azure OpenAI API

I have deployed chat model on Azure OpenAI Studio and given the model my own data source using "Add your data (preview)" feature.

In the Chat session on the Chat playground page, the chat model gives a correct answer based on the data I provided. However, when I ask the same question via the API, the model cannot use that data source.

I'd like to use a chat model that uses my own data source via the API. How do I fix this issue?

Here is what I have tried.

  1. Deploy a gpt-35-turbo model on Azure OpenAI Studio

  2. Add my own data using the "Add your data (preview)" feature

  3. The model gives a correct answer based on that data in the Chat session view

  4. However, the model behaves as if it does not know the data when I ask the same question via the API.
#Note: The openai-python library support for Azure OpenAI is in preview.
import os
import openai
openai.api_type = "azure"
openai.api_base = "https://openai-test-uksouth.openai.azure.com/"
openai.api_version = "2023-03-15-preview"
openai.api_key = "KEY"

response = openai.ChatCompletion.create(
    engine="gpt35turbo",
    messages=[
        {"role": "system", "content": "You are an AI assistant that helps people find information."},
        {"role": "user", "content": "Summarize `main.py`!"},
    ],
    temperature=0,
    max_tokens=800,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None)

print(response)

The response is

{
  "id": "chatcmpl-7dtf29DavpRsKGWygZIrJDwj0MDGn",
  "object": "chat.completion",
  "created": 1689743108,
  "model": "gpt-35-turbo",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "I'm sorry, I cannot summarize `main.py` without more information. `main.py` could refer to any Python file and could contain any number of functions or code. Please provide more context or information about the specific `main.py` file you are referring to."
      }
    }
  ],
  "usage": {
    "completion_tokens": 54,
    "prompt_tokens": 32,
    "total_tokens": 86
  }
}

Upvotes: 3

Views: 1994

Answers (4)

Bernard

Reputation: 301

Adding your own data from the chat playground only adds it temporarily, for use from the UI. It does not update the deployment configuration. This is why your API call doesn't return any answer related to your data.

Using the notebook from the openai-cookbook (or the quickstart from Azure), I managed to get a working example. Make sure to use v1 of the OpenAI library; if needed, see the Azure OpenAI migration guide.

import json
import openai
import os


endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
deployment = os.environ["AZURE_OPENAI_DEPLOYMENT"]

client = openai.AzureOpenAI(
    base_url=f"{endpoint}/openai/deployments/{deployment}/extensions",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2023-08-01-preview",
)

completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "What are the differences between Azure Machine Learning and Azure AI services?"}],
    model=deployment,
    extra_body={
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": os.environ["SEARCH_ENDPOINT"],
                    "key": os.environ["SEARCH_KEY"],
                    "indexName": os.environ["SEARCH_INDEX_NAME"],
                    # parameters below copied from web playground request response
                    "semanticConfiguration": "default",
                    "queryType": "semantic",
                    "fieldsMapping": {
                        "contentFieldsSeparator": "\n",
                        "contentFields": ["text"],
                        "filepathField": "file_name",
                        "titleField": None,
                        "urlField": None,
                        "vectorFields": [],
                    },
                    "inScope": True,  # probably restricts generation to retrieved inputs only
                    "roleInformation": "You are an AI assistant that helps people find information.",
                    "filter": None,
                    "strictness": 3,
                    "topNDocuments": 5,
                }
            }
        ]
    }
)
print(f"{completion.choices[0].message.role}: {completion.choices[0].message.content}")

# `context` is in the model_extra for Azure
context = completion.choices[0].message.model_extra['context']['messages'][0]['content']
print(f"\nContext: {context}")

Upvotes: 0

user2742409

Reputation: 315

To me this sounds more like you are missing the step of adding the index that holds your specific data to the conversation. Via the "View Code" section you can retrieve the needed information about your Cognitive Search instance and the index.
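As a rough sketch, the three values surfaced by the "View Code" panel map onto the Cognitive Search parameters of the `dataSources` payload. The placeholder values below are illustrative, not real resources:

```python
# Hypothetical values as shown in the "View Code" panel of the playground;
# substitute your own Cognitive Search instance details.
search_config = {
    "endpoint": "https://<your-search-resource>.search.windows.net",
    "key": "<your-search-admin-key>",
    "indexName": "<your-index-name>",
}

# These map directly onto the "AzureCognitiveSearch" parameters block of the
# dataSources payload sent with each chat completion request.
data_source = {
    "type": "AzureCognitiveSearch",
    "parameters": search_config,
}
```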

Upvotes: 0

Bobble Knight

Reputation: 13

https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?pivots=rest-api&tabs=command-line#example-curl-commands

This should help you. I couldn't find support for "dataSources" inside the OpenAI SDK, so you probably need to switch back to the plain requests format.

To give more details: the chat preview is just a playground; you don't actually modify the model or save anything there, so you need to specify the data source in the API call, the same as you would in the Azure playground.
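A minimal sketch of that plain-HTTP approach, hitting the "extensions" chat completions route with a `dataSources` block. The deployment name and endpoint are taken from the question; the search values and API version are placeholder assumptions:

```python
import json
import os
import urllib.request


def build_request(endpoint, deployment, search_endpoint, search_key, index_name, question):
    """Build the URL and JSON body for the 'extensions' chat completions
    route, which accepts the dataSources block the plain SDK call lacks."""
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/extensions/chat/completions?api-version=2023-06-01-preview")
    body = {
        "dataSources": [{
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": search_endpoint,
                "key": search_key,
                "indexName": index_name,
            },
        }],
        "messages": [{"role": "user", "content": question}],
    }
    return url, body


url, body = build_request(
    "https://openai-test-uksouth.openai.azure.com",  # endpoint from the question
    "gpt35turbo",                                    # deployment from the question
    "https://example.search.windows.net",            # placeholder search values
    "<search-key>",
    "<index-name>",
    "Summarize `main.py`!",
)

# Only fire the request when real credentials are available.
if "AZURE_OPENAI_API_KEY" in os.environ:
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"],
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```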

If you do not plan to change the prompt, you can also deploy the model with the data added as a web app and call the API directly from there (you need to remove SSO if it is not used and replace it with a JWT token). But I do not advise using it like that for simple API calls.

Upvotes: 0

Ram

Reputation: 2754

It's a known bug in "Limit your responses to your data content" via the API using gpt-35-turbo; I will update once the fix is rolled out.

Upvotes: 0
