Reputation: 164
I am trying to use Azure OpenAI batch jobs.
My application uses Azure OpenAI over enterprise data, so the grounding data comes from an Azure Search index.
I use GPT-4o mini. When I make regular REST API calls to the model, I get a proper response. Now I have tried the batch process that is newly available for GPT-4o mini in the East US region.
When I make a batch call without the (Azure Search) data_sources tag, I get a proper response. But when I include the (Azure Search) data_sources tag to ground the response for my query, I always get an empty response.
Here is my request.jsonl:
{"custom_id":"task-0","method":"POST","url":"/chat/completions","body":{"model":"gpt-4o-mini-batch","messages":[{"role":"system","content":"You are an AI assistant that answers questions based only on the provided documents."},{"role":"user","content":"Tell me the names of sisters of Diana"}],"top_p":1,"frequency_penalty":0,"max_tokens":1024,"presence_penalty":0,"semantic_configuration":{"name":"my-semantic-config"},"temperature":0,"data_sources":[{"type":"azure_search","parameters":{"endpoint":"https://XXXXXXXXXXXXX.search.windows.net","scope":{"in_scope":true},"index_name":"XXXXXXXXXXXX","key":"XXXXXXXXXXXXXXX","role_information":"You are an AI assistant that helps people find information."}}]}}
Code:
import datetime
import json
import os
import time

import dotenv
from openai import AzureOpenAI

dotenv.load_dotenv()

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2024-07-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# Upload the JSONL file containing the batch requests
current_directory = os.getcwd()
file = client.files.create(
    file=open(current_directory + "/resources/requestJSON.jsonl", "rb"),
    purpose="batch",
)
print(file.model_dump_json(indent=2))
file_id = file.id

# Wait until the uploaded file has been processed
status = "pending"
while status != "processed":
    time.sleep(5)
    file_response = client.files.retrieve(file_id)
    status = file_response.status
    print(f"{datetime.datetime.now()} File Id: {file_id}, Status: {status}")

# Create the batch job
batch_response = client.batches.create(
    input_file_id=file_id,
    endpoint="/chat/completions",
    completion_window="24h",
)
batch_id = batch_response.id
print("BatchId-->" + batch_id)
print(batch_response.model_dump_json(indent=2))

# Poll until the batch reaches a terminal state
status = "validating"
while status not in ("completed", "failed", "canceled"):
    time.sleep(60)
    batch_response = client.batches.retrieve(batch_id)
    status = batch_response.status
    print(f"{datetime.datetime.now()} Batch Id: {batch_id}, Status: {status}")

# Print the results, one JSON object per line
if status == "completed":
    file_response = client.files.content(batch_response.output_file_id)
    raw_responses = file_response.text.strip().split("\n")
    for raw_response in raw_responses:
        json_response = json.loads(raw_response)
        print(json.dumps(json_response, indent=2))
elif status == "failed":
    print(f"{datetime.datetime.now()} Batch Id: {batch_id}, Status: {status}")
Can you help me understand what is incorrect?
Note: my Azure Search index also contains embeddings.
Additional info: here is the batch_response:
Batch(id='batch_1b692a78-8c19-454c-a90e-7ad8fd22e396', completion_window='24h', created_at=1724911650, endpoint='/chat/completions', input_file_id='file-0137f769ab9d4e3aae40d1ad42c1e48c', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1724912101, error_file_id='file-cdc16b4a-a9f1-4de3-aaac-61506e615117', errors=None, expired_at=None, expires_at=1724998050, failed_at=None, finalizing_at=1724911948, in_progress_at=1724911860, metadata=None, output_file_id='file-96ec7d4d-ce6b-4b63-aad9-2a623b1a0fd7', request_counts=BatchRequestCounts(completed=0, failed=1, total=1))
From the output, I see the status is completed, yet request_counts shows failed=1 and there is an error_file_id. However, I don't see any error message. Maybe this can help.
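Since the batch response includes an error_file_id, the per-request failure details can be downloaded the same way as the output file. A minimal sketch, assuming the client from the code above and that the error file is JSONL with custom_id and error fields (the exact schema may differ):

```python
import json

def parse_batch_errors(jsonl_text):
    """Turn the JSONL text of a batch error file into (custom_id, error) pairs."""
    errors = []
    for line in jsonl_text.strip().splitlines():
        if line:
            record = json.loads(line)
            errors.append((record.get("custom_id"), record.get("error")))
    return errors

# Hypothetical usage with the client from the question:
# error_file = client.files.content(batch_response.error_file_id)
# for custom_id, error in parse_batch_errors(error_file.text):
#     print(custom_id, error)
```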
Upvotes: 0
Views: 386
Reputation: 164
Here is the answer: batch processing is not yet supported for grounding data (data_sources).
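Until batch supports grounding, the grounded query can be issued through the regular (synchronous) chat completions endpoint, which the question confirms works. A sketch of building such a request, mirroring the data_sources parameters from the question's JSONL (the model/deployment name, endpoint, index name, and key are placeholders) and passing them via the Python client's extra_body:

```python
def build_grounded_request(query, search_endpoint, index_name, search_key):
    """Build kwargs for a synchronous chat completion grounded on an
    Azure Search index, mirroring the JSONL body from the question."""
    return {
        "model": "gpt-4o-mini",  # placeholder: the regular (non-batch) deployment name
        "messages": [
            {"role": "system",
             "content": "You are an AI assistant that answers questions "
                        "based only on the provided documents."},
            {"role": "user", "content": query},
        ],
        "temperature": 0,
        "max_tokens": 1024,
        # data_sources is an Azure-specific extension, so it goes
        # through the client's extra_body rather than a named parameter
        "extra_body": {
            "data_sources": [{
                "type": "azure_search",
                "parameters": {
                    "endpoint": search_endpoint,
                    "index_name": index_name,
                    "key": search_key,
                    "scope": {"in_scope": True},
                },
            }]
        },
    }

# Hypothetical usage with the client from the question:
# response = client.chat.completions.create(
#     **build_grounded_request(
#         "Tell me the names of sisters of Diana",
#         "https://<search-service>.search.windows.net",
#         "<index-name>", "<search-key>"))
# print(response.choices[0].message.content)
```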
Upvotes: 1