Pasindu Lakshan

Reputation: 100

How to combine multiple FAISS indexes into one to get a single retriever

pdf = load_pdf(help_doc_name)
faiss_index_ft9Help = FAISS.from_documents(pdf, OpenAIEmbeddings())
faiss_index_ft9Help.save_local(index_path + "/" + help_doc_name)

# load newsletters
pdf = load_pdf(newsletters_doc_name)
faiss_index_newsletters = FAISS.from_documents(pdf, OpenAIEmbeddings())
faiss_index_newsletters.save_local(index_path + "/" + newsletters_doc_name)

# load support cases
pdf = load_pdf(supportCases_doc_name)
faiss_index_supportCases = FAISS.from_documents(pdf, OpenAIEmbeddings())
faiss_index_supportCases.save_local(index_path + "/" + supportCases_doc_name)

retriever = MultiIndexRetriever(
    [faiss_index_ft9Help, faiss_index_newsletters, faiss_index_supportCases])

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=False
)

MultiIndexRetriever does not exist; it is only a placeholder for what I want. I need to create a single retriever from three FAISS indexes. I also need to use those three indexes separately afterward, to get the reference pages by doing a similarity search. Is there a way to do this, or a better alternative? This is the part where I use the chain:

while True:
    question = input("You: ")

    if question.lower() == "exit":
        print("Bot: Goodbye!")
        break

    response = qa_chain.run(question)

    print("Bot: " + response + "\n\n")

Please note that I haven't implemented the part that gets the references yet.

Upvotes: 2

Views: 7908

Answers (1)

Muhammad Safwan

Reputation: 1024

The method you are looking for is merge_from.

You can use it like this:

pdfs = [help_doc_name, newsletters_doc_name, supportCases_doc_name]

for index, pdf in enumerate(pdfs):
    content = load_pdf(pdf)
    if index == 0:
        # the first PDF creates the base index
        faiss_index = FAISS.from_documents(content, OpenAIEmbeddings())
    else:
        # every other PDF gets its own index, which is merged into the base
        faiss_index_i = FAISS.from_documents(content, OpenAIEmbeddings())
        faiss_index.merge_from(faiss_index_i)

faiss_index.save_local(index_path)

retriever = faiss_index.as_retriever(
    search_type="similarity", search_kwargs={"k": 3}
)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=False
)

This iterates through the list of PDFs, creating the FAISS index from the first one and then merging each of the remaining ones into it.
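Once everything is merged, the combined index no longer tells you which PDF a hit came from unless the chunks carry that information themselves. One possible way to still get per-source reference pages is to tag each chunk's metadata before building the indexes, then group the retrieved chunks afterward. The following is only a minimal sketch: `Doc` is a stand-in with the same `page_content` and `metadata` attributes as a LangChain Document, `tag_documents`, `collect_references` and the `source_index` key are hypothetical names, and it assumes your `load_pdf` puts a `page` entry in the metadata (some PDF loaders do, yours may not).

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def tag_documents(docs, index_name):
    """Record on each chunk which source it belongs to (hypothetical key name)."""
    for doc in docs:
        doc.metadata["source_index"] = index_name
    return docs


def collect_references(hits):
    """Group the page numbers of retrieved chunks by their originating source."""
    refs = {}
    for doc in hits:
        name = doc.metadata.get("source_index", "unknown")
        page = doc.metadata.get("page")  # assumption: the loader sets "page"
        pages = refs.setdefault(name, [])
        if page is not None and page not in pages:
            pages.append(page)
    return {name: sorted(pages) for name, pages in refs.items()}
```

In the loop above this would mean calling something like `content = tag_documents(load_pdf(pdf), pdf)` before `FAISS.from_documents`, and later passing the result of `faiss_index.similarity_search(question, k=3)` to `collect_references`.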

Upvotes: 5
