Reputation: 31
I have a custom JSON dataset created from an Excel sheet; my questions should be based on this data, and I need OpenAI to answer them from it. For that I currently have the following code -
import io
import boto3
import pandas as pd
import openai

# read the Excel file from S3
s3 = boto3.client('s3')
obj = s3.get_object(Bucket='bucketname', Key='sample.xlsx')
data = obj['Body'].read()
df = pd.read_excel(io.BytesIO(data), sheet_name='randomsheetname')
df = df.to_dict("records")  # convert the sheet to a list of JSON-like records

# send the prompt together with the JSON data to OpenAI
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{
        "role": "system", "content": f"{prompt}. \n\nJSON file: {df}. \n\nAnswer:"
    }],
    temperature=0.5,
    max_tokens=500
)
With this I'm able to get a response to any question based on the JSON data I supply to openai.ChatCompletion.create().
Now, if I want to keep track of my previous conversations and give OpenAI that context so it can answer follow-up questions in the same conversation thread, I'd have to go with LangChain. I'm having trouble providing the JSON dataset to my ChatOpenAI() and ConversationChain(), since I'm working with something like this (written in Python):
llm = ChatOpenAI(temperature=0.5, openai_api_key=api_key, model="gpt-4")

conversation = ConversationChain(
    llm=llm,
    prompt=prompt_template,
    verbose=True,
    memory=memory,
    chain_type_kwargs=chain_type_kwargs
)
response = conversation.predict(input=prompt)
Kindly help.
Upvotes: 3
Views: 4909
Reputation: 41
I use the following approach in LangChain.
Here is a simple use case for ChatOpenAI in LangChain:
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
llm = ChatOpenAI(temperature=0.9, model_name="gpt-3.5-turbo", max_tokens=2048)

system_text = "You are a helpful assistant that tells jokes"
human_prompt = "Tell a joke"

output_answer = llm.predict_messages([SystemMessage(content=system_text), HumanMessage(content=human_prompt)])
print(output_answer.content)
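If you only need the model to see your JSON data plus the running conversation, a minimal sketch (assuming the `df` list of dicts from your question; the example questions are just placeholders) is to put the data in a SystemMessage once and keep appending the Human/AI messages yourself:

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

llm = ChatOpenAI(temperature=0.5, model_name="gpt-4", max_tokens=500)

# df is the list of dicts built from the Excel sheet in your question
messages = [SystemMessage(content=f"Answer questions using this JSON data:\n{df}")]

def ask(question):
    messages.append(HumanMessage(content=question))
    answer = llm.predict_messages(messages)  # returns an AIMessage
    messages.append(answer)                  # keep the reply as conversation context
    return answer.content

print(ask("Which record has the highest value?"))
print(ask("And which one comes second?"))  # follow-up question sees the earlier turns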
Since you need to provide documents, you could also look at ConversationalRetrievalChain or other retrieval options: they let you store the document context in a vector store, which is useful for keeping the token count down.
from langchain.chains import ConversationalRetrievalChain

# assumes `vectorstore` has already been built from your documents
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0, model="gpt-4"),
    vectorstore.as_retriever()
)

chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
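For your case, one way to get the JSON records into a vector store is sketched below. This assumes FAISS with OpenAIEmbeddings (any other vector store works the same way) and the `df` list of dicts from your question; the memory object lets the chain track the chat history for you instead of passing a list on every call.

from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.schema import Document
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# turn each record from the Excel sheet into a Document
docs = [Document(page_content=str(record)) for record in df]

# requires the faiss-cpu package
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# the chain keeps the conversation history itself
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0, model="gpt-4"),
    vectorstore.as_retriever(),
    memory=memory,
)

result = qa({"question": "Which rows mention the highest sales figure?"})  # placeholder question
print(result["answer"])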
Upvotes: 1