jasan

Reputation: 12927

How to store chat history using the LangChain ConversationalRetrievalQAChain in a Next.js app?

I'm creating a text-document QA chatbot. I'm using LangChain.js together with the OpenAI LLM for creating embeddings and chat, and Pinecone as my vector store.

See diagram: [architecture diagram, image not included]

After successfully uploading the embeddings and creating an index on Pinecone, I am using a Next.js app to communicate with OpenAI and Pinecone.

The current structure of my app is as follows:

1. Frontend -> the user inputs a question and the browser makes a POST call to a Next.js server API route, /ask.

2. The server function looks like the following:

// Imports assume the classic single-package LangChain.js layout
import { NextResponse } from 'next/server'
import { PineconeStore } from 'langchain/vectorstores/pinecone'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'
import { ChatOpenAI } from 'langchain/chat_models/openai'
import { ConversationSummaryMemory } from 'langchain/memory'
import { ConversationalRetrievalQAChain } from 'langchain/chains'

// `pineconeIndex` is created elsewhere from the Pinecone client (omitted here)
const vectorStore = await PineconeStore.fromExistingIndex(
    new OpenAIEmbeddings(),
    { pineconeIndex }
)

const model = new ChatOpenAI({ temperature: 0.5, modelName: 'gpt-3.5-turbo' })

// Summarises previous turns and exposes them under the 'chat_history' key
const memory = new ConversationSummaryMemory({
    memoryKey: 'chat_history',
    llm: new ChatOpenAI({ modelName: 'gpt-3.5-turbo', temperature: 0 }),
})

const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorStore.asRetriever(),
    {
        memory,
    }
)

const result = await chain.call({
    question,
})

return NextResponse.json({ data: result.text })

The issue: the chatbot never has access to any history, because the memory only ever contains the latest message.

console.log('memory:', await memory.loadMemoryVariables({}))

I also tried BufferMemory with the same issue: the memory buffer only contains the message that was just asked. When a new query comes in, the buffer is cleared and the new query becomes the only message in memory.

I may be unclear on how exactly to correctly store the history so that the const result = await chain.call({ question }) call has access to the previous messages.

UPDATE: I have successfully used Upstash Redis-backed chat memory with LangChain; however, I'm still curious whether it is possible to store the messages using the user's browser instead.
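
What I have in mind for the browser approach is roughly the sketch below. Since the API route is stateless, the memory object above is recreated on every request, so the client would keep the running message history in React state, send it with each POST to /ask, and the route would pass it to the chain as chat_history instead of attaching a memory object. The history / role / content shape here is just an assumption for illustration:

// app/api/ask/route.ts (sketch) -- same imports and pineconeIndex setup as above,
// minus ConversationSummaryMemory
export async function POST(req: Request) {
    // `history` is whatever the browser kept in state: [{ role, content }, ...]
    const { question, history } = await req.json()

    const vectorStore = await PineconeStore.fromExistingIndex(
        new OpenAIEmbeddings(),
        { pineconeIndex }
    )
    const model = new ChatOpenAI({ temperature: 0.5, modelName: 'gpt-3.5-turbo' })

    // No memory attached -- chat_history is supplied explicitly on every call
    const chain = ConversationalRetrievalQAChain.fromLLM(model, vectorStore.asRetriever())

    // A plain string of prior turns is enough for the question-generation step
    const chatHistory = history
        .map((m: { role: string; content: string }) => `${m.role}: ${m.content}`)
        .join('\n')

    const result = await chain.call({ question, chat_history: chatHistory })
    return NextResponse.json({ data: result.text })
}

That would keep the history entirely in the user's browser, at the cost of re-sending it with every request and losing it when the tab is closed.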

Upvotes: 6

Views: 3540

Answers (2)

Yilmaz

Reputation: 49182

I think you should use the returnMessages: true property of ConversationSummaryMemory:

const memory = new ConversationSummaryMemory({
  memoryKey: 'chat_history',
  llm: new ChatOpenAI({ modelName: 'gpt-3.5-turbo', temperature: 0 }),
  returnMessages: true
})

This is the code for loadMemoryVariables from the ConversationSummaryMemory source:

async loadMemoryVariables(_) {
    if (this.returnMessages) {
        const result = {
            [this.memoryKey]: [new this.summaryChatMessageClass(this.buffer)],
        };
        return result;
    }
    const result = { [this.memoryKey]: this.buffer };
    return result;
}

If you set returnMessages: true, it will return a this.summaryChatMessageClass(this.buffer) object; otherwise it just returns a plain string.
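
As a quick sanity check you can log both shapes (a sketch, assuming the same memory object as above; the summary text is only illustrative):

// After a turn is saved, the running summary accumulates under `chat_history`
await memory.saveContext(
    { input: 'What does the document say about pricing?' },
    { output: 'It lists three pricing tiers.' }
)

// returnMessages: false (default) -> { chat_history: '<summary string>' }
// returnMessages: true            -> { chat_history: [SystemMessage('<summary string>')] }
console.log(await memory.loadMemoryVariables({}))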

Upvotes: 0

lax1089

Reputation: 3473

I am using the same exact stack as you (Next.js, Pinecone, LangChain ConversationalRetrievalQAChain), and I fought with this exact issue for a while.

Eventually I resorted to the 'hack' below, which does work: basically capturing the history in a messages array which I shove onto the end of the prompt. To my knowledge this is not all that different from how LangChain's 'Memory' components are supposed to work, so I don't feel terrible doing this.

Here is a trimmed-down version of my main handler function. Take note of the line starting with queryText += '\nIf necessary ...

export async function POST(req: Request) {
    const { messages } = await req.json();
    let queryText = messages[messages.length - 1].content;

    // Shove the whole running history onto the end of the question itself
    queryText += '\nIf necessary, utilize the below chat history as additional context:' + JSON.stringify(messages);

    // streamingModel, nonStreamingModel and vectorStore are set up elsewhere (trimmed)
    const chain = ConversationalRetrievalQAChain.fromLLM(
        streamingModel,
        vectorStore.asRetriever(),
        {
            returnSourceDocuments: true,
            questionGeneratorChainOptions: {
                llm: nonStreamingModel,
            },
        }
    );

    const { stream, handlers } = LangChainStream();
    chain.call({ question: queryText }, [handlers]).catch(console.error);

    return new StreamingTextResponse(stream);
}
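
For completeness, the messages array that this handler reads comes from the Vercel AI SDK's useChat hook on the client (a sketch; /api/ask is just a placeholder for wherever the handler above is mounted):

'use client'
import { useChat } from 'ai/react'

export default function Chat() {
    // useChat keeps the whole conversation in browser state and POSTs it
    // as { messages } on every submit, which is exactly what the handler reads
    const { messages, input, handleInputChange, handleSubmit } = useChat({ api: '/api/ask' })

    return (
        <form onSubmit={handleSubmit}>
            {messages.map((m) => (
                <div key={m.id}>{m.role}: {m.content}</div>
            ))}
            <input value={input} onChange={handleInputChange} />
        </form>
    )
}

Because useChat re-sends the whole array on every submit, the handler always sees the full history without any server-side memory.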

Upvotes: 0
