Reputation: 161
I was wondering if someone has experience with real-life RAG flows.
A very simple scenario
User asks: Give me the top 5 takeaways "search phrase" (e.g. MS keynote)
The code behind it runs an Azure AI Search query and returns the five most relevant documents, which are passed to the model in a prompt together with the user's original question
The system answers with a text with the top 5 takeaways
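Simplified, the flow looks roughly like this (a minimal sketch in the 0.x openai SDK style; index, deployment, and field names are placeholders and the 'content' field is an assumption about the index schema):

import openai
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(endpoint="<search-endpoint>",
                             index_name="<index>",
                             credential=AzureKeyCredential("<key>"))

def answer(question: str) -> str:
    # retrieve the five most relevant documents for the user's phrase
    docs = search_client.search(search_text=question, top=5)
    context = "\n\n".join(d["content"] for d in docs)  # assumes a 'content' field
    # pass the retrieved context plus the original question to the model
    response = openai.ChatCompletion.create(
        engine="<deployment>",
        messages=[
            {"role": "system", "content": "Answer using only the provided context.\n\nContext:\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return response["choices"][0]["message"]["content"]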
And this is where most examples on the internet end.
How do you handle the following follow-up question from the user:
Give me 5 more takeaways
If you run Azure Search with that follow-up phrase alone, you almost certainly won't get anything relevant to the original question. Or do you pass the follow-up straight to the model, together with the chat history, the original question, the originally retrieved documents, and the model's previous answer?
What are your experiences?
thanks
Upvotes: 1
Views: 263
Reputation: 11
Okay, so one way I did this was to use a "prompt rephrasing agent". It takes the chat history and the new prompt and produces a rewritten query that gives better results in the RAG retrieval step. For your example, the first prompt asked for the top takeaways; when the user then asks "give me 5 more", the rephrasing agent can rewrite the query to "top 10 takeaways", or however else it decides to improve retrieval.
Now suppose the documents retrieved for the first 5 and the next 5 are the same. That case is handled in the answer-generator agent: since it also has the chat history, it understands that it has to state takeaways other than the 5 it gave before.
So basically the solution I'm suggesting is to add an agentic workflow in which many decisions are made by LLMs, such as evaluating the retrieved documents and deciding whether the answer needs to be regenerated. Hell, you can even add a censorship agent that evaluates whether a reply is safe or not.
I created a CLI RAG project using LangGraph [GitHub]. Look at the diagram in the README; there is a "transform_query" node that does basically what I described above.
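Roughly, such a rephrasing step can look like this (a minimal sketch; the deployment name, function name, and prompt wording are just illustrative):

import openai

def rephrase_query(chat_history: list, new_prompt: str) -> str:
    # ask the model to turn the follow-up into a standalone search query
    history_text = "\n".join(f"{m['role']}: {m['content']}" for m in chat_history)
    response = openai.ChatCompletion.create(
        engine="<deployment>",
        messages=[
            {"role": "system", "content": "Rewrite the user's latest message as a standalone search query, "
                                          "using the conversation so far to resolve references like 'more' or 'it'."},
            {"role": "user", "content": f"Conversation:\n{history_text}\n\nLatest message: {new_prompt}"},
        ],
        temperature=0,
    )
    return response["choices"][0]["message"]["content"]

# e.g. a history about "top 5 takeaways from the MS keynote" followed by "give me 5 more"
# would come back as something like "top 10 takeaways from the MS keynote"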
Upvotes: 0
Reputation: 1105
I think you're blurring the line between when context retrieval takes place and how the "chat-like" interaction between the user and the LLM is enabled.
There are many ways you might trigger a retrieval.
Consider your first prompt: "I want to summarize the top 5 docs and get 5 insights ..." This prompt makes a logical request that some records be fetched if they are not already present.
Consider the follow-up prompt: "Give me five more insights..." This prompt acknowledges an implied context (i.e., that information already retrieved is available for interpretation).
Given that you're using ChatGPT functions, you need to tune your function description to state WHEN it should be invoked. In this case, that's roughly "if context or information is missing that's needed to respond accurately to the user, invoke ...".
Doing this allows the model to retrieve documents and information throughout the conversation when necessary, while still responding to the user accurately using any context that has already been loaded.
Lastly, consider templating your system message with a dedicated area for appending retrieved context, rather than adding it to the last message. I've found that strategy gives more reliable results.
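A rough sketch of both ideas (a function description that says when to retrieve, and a templated system message), assuming the same functions API used in the other answers; names and wording are illustrative:

import openai

# the function description tells the model WHEN to retrieve, not just what the tool does
functions = [{
    "name": "retrieve_documents",
    "description": ("Invoke this if context or information is missing that is needed to respond "
                    "accurately to the user. Do not invoke it if the answer can be given from "
                    "context already present in the conversation."),
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "standalone search query"}},
        "required": ["query"],
    },
}]

# system message templated with a dedicated slot for retrieved context
SYSTEM_TEMPLATE = (
    "You are an assistant that answers from the retrieved context below.\n"
    "### Retrieved context\n{context}\n### End of context"
)

def build_messages(context: str, history: list) -> list:
    return [{"role": "system", "content": SYSTEM_TEMPLATE.format(context=context)}] + history

# the model then decides per turn whether retrieval is needed
response = openai.ChatCompletion.create(
    engine="<deployment>",
    messages=build_messages(context="(previously retrieved documents)",
                            history=[{"role": "user", "content": "Give me five more insights"}]),
    functions=functions,
    function_call="auto",
)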
Upvotes: 0
Reputation: 9
One thing you can consider is filtering documents by their search score (which, in Azure AI Search, is a floating-point number).
So the first step is to retrieve the first five documents, ordered by search score, and store the score of the last one. When the user then asks to "return five more relevant documents", filter for the next five documents whose search scores are lower than the stored score of the fifth document from the original query.
In the following code I use the function-calling feature to extract the parameters 'keyphrase' and 'search_score'. I then define two functions: returnFinalSearchScore() and search().
The function returnFinalSearchScore() returns the '@search.score' of the last of the top 5 documents, which is what we need for the subsequent prompts. The function search() queries the Azure AI Search index (results come back ordered by '@search.score'), extracts the key phrases from each qualifying document, and passes them to the chat engine for summarization.
# importing the required namespaces
import os
import openai
import requests
import json
from dotenv import load_dotenv
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# setting the important configurations
load_dotenv()
index_name = "azureblob-index"
endpoint = os.getenv('search_endpoint')
key = os.getenv('search_key')
openai.api_type = "azure"
openai.api_base = os.getenv('oai_base')
openai.api_version = "2023-09-15-preview"
openai.api_key = os.getenv('oai_key')

# create an Azure AI Search client
credential = AzureKeyCredential(key)
client = SearchClient(endpoint=endpoint, index_name=index_name, credential=credential)

# creating the function definition for function calling
functions = [
    {
        "name": "retrieve_documents",
        "description": "retrieves documents from the Azure AI Search index based on the given keyphrase",
        "parameters": {
            "type": "object",
            "properties": {
                "keyphrase": {
                    "type": "string",
                    "description": "the keyphrase to be used for searching the documents through Azure AI Search"
                },
                "search_score": {
                    "type": "number",
                    "description": "the search score of the last indexed document"
                }
            },
            "required": ["keyphrase", "search_score"]
        }
    }
]

# initial message prompt to be fed into the engine
messages = [
    {"role": "system", "content": "you are an AI assistant that helps people retrieve information"},
    {"role": "user", "content": "I want to summarise the top 5 documents with the keyphrase 'Open AI' with the search score of the last indexed document as 100"}
]

# storing the initial prompt response from the engine in a variable called initial_response
initial_response = openai.ChatCompletion.create(
    engine="YOUR_ENGINE_NAME",
    messages=messages,
    functions=functions,
    temperature=0.7
)

# extracting the keyphrase and search_score parameters from initial_response
response_message = initial_response["choices"][0]["message"]
function_name = response_message["function_call"]["name"]
function_args = json.loads(response_message["function_call"]["arguments"])
keyphrase = function_args["keyphrase"]
search_score = function_args["search_score"]

# returns the '@search.score' of the last of the top 5 documents, to be reused in follow-up prompts
def returnFinalSearchScore(keyphrase, search_score):
    results = list(client.search(search_text=keyphrase))
    finalSearchScore = results[4]['@search.score']
    print('the search score of the final document is', finalSearchScore)
    return finalSearchScore

# creating a function to search the top 5 documents according to the extracted keyphrase
def search(keyphrase, search_score):
    results = client.search(search_text=keyphrase)
    document_keyphrases = []
    for result in results:
        if len(document_keyphrases) >= 5:
            break
        # keep only documents whose score is below the stored score of the previous batch
        if result['@search.score'] < search_score:
            document_keyphrases.append(result['keyphrases'])

    # summarise each document from its extracted key phrases
    for keyphrases in document_keyphrases:
        joined_keyphrases = " ".join(keyphrases)
        messages = [
            {"role": "system", "content": "you are an AI help assistant that helps the user summarise the document based on the set of keywords given to you"},
            {"role": "user", "content": joined_keyphrases}
        ]
        message = openai.ChatCompletion.create(
            engine="YOUR_ENGINE_NAME",
            messages=messages,
            temperature=0.7
        )
        print(message["choices"][0]["message"]["content"])

# running the final code through the commands below
search(keyphrase, search_score)
returnFinalSearchScore(keyphrase, search_score)
Upvotes: 0