G. Macia

Reputation: 1521

How to specify a similarity threshold in the LangChain FAISS retriever?

I would like to pass a similarity threshold to the retriever. So far I have only figured out how to pass a k value, which limits the number of documents returned but does not filter them by score. How can I pass a threshold instead?

from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

def get_conversation_chain(vectorstore):
    llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
    # search_kwargs={'k': 2} only caps the number of retrieved documents at 2
    qa = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(search_kwargs={'k': 2}),
        return_source_documents=True,
        verbose=True,
    )
    return qa

loader = PyPDFLoader("sample.pdf")
# load the PDF and split it into documents
pages = loader.load_and_split()
# build the FAISS index from the loaded documents
faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())
# create conversation chain
chat_history = []
qa = get_conversation_chain(faiss_index)
query = "What is a sunflower?"
result = qa({"question": query, "chat_history": chat_history})

Upvotes: 6

Views: 13289

Answers (2)

Orhan Sönmez

Reputation: 101

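You can still use as_retriever to get a VectorStoreRetriever, as you do now, but set the search_type parameter to "similarity_score_threshold" and pass the threshold in search_kwargs. In the snippet below, dbFAISS is the FAISS vector store and top_k is the maximum number of documents to return.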

retriever = dbFAISS.as_retriever(search_type="similarity_score_threshold", 
                                 search_kwargs={"score_threshold": .5, 
                                                "k": top_k})
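
To make the interplay of score_threshold and k concrete, here is a minimal pure-Python sketch of the idea: rank documents by similarity, keep at most k, then drop anything below the threshold. It is only an illustration and not LangChain's actual implementation. The function names, the toy 2-d vectors, and the use of cosine similarity as the score are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_with_threshold(query_vec, doc_vecs, score_threshold, k):
    """Hypothetical helper: top-k by similarity, then filter by threshold."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Keep at most k candidates, then discard those scoring below the threshold
    return [(doc_id, score) for doc_id, score in scored[:k]
            if score >= score_threshold]

# Toy "embeddings" (made up for illustration)
docs = {
    "sunflower": [1.0, 0.0],
    "daisy": [0.8, 0.6],
    "tractor": [0.0, 1.0],
}
query = [1.0, 0.0]

loose_hits = retrieve_with_threshold(query, docs, score_threshold=0.5, k=2)
strict_hits = retrieve_with_threshold(query, docs, score_threshold=0.9, k=2)
print([doc_id for doc_id, _ in loose_hits])
print([doc_id for doc_id, _ in strict_hits])
```

Note that with a strict threshold the result can contain fewer than k documents, or none at all, so the calling code should be prepared for an empty list.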

Upvotes: 10

G. Macia

Reputation: 1521

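This was the answer, from the API docs: pass search_kwargs={'score_threshold': 0.3} to as_retriever. As far as I can tell from the docs, the threshold is only applied when search_type="similarity_score_threshold" is set as well.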

Upvotes: 3
