Dev_Man

Reputation: 896

Getting scores with langchain retriever

I am trying to get the scores of the documents retrieved when using LangChain retrievers. Below is a snippet of my current implementation, which returns the documents without scores.

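from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, embeddings_model)
# k has to be passed through search_kwargs; a bare k=4 is not forwarded to the search
semantic_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
result = semantic_retriever.invoke(query)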

I looked it up, and the LangChain docs describe a way to do this: write a custom retriever wrapper that adds the similarity score to each document's metadata. See https://python.langchain.com/v0.2/docs/how_to/add_scores_retriever/

That approach may work, but it doesn't make sense to me. Why should a score become part of the document's permanent metadata? Also, what is the difference between invoke and similarity_search_with_score? This is LangChain 0.2, by the way.

Also, how do I get similarity scores from the BM25 retriever and the ensemble retriever (from langchain.retrievers import EnsembleRetriever, BM25Retriever)?
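
For the ensemble case, as far as I understand, EnsembleRetriever merges the ranked lists from its retrievers with weighted Reciprocal Rank Fusion, so any score it could expose would be a fused rank score rather than a similarity. A minimal sketch of that fusion in plain Python, assuming the constant c = 60 and 1-based ranks; the function name weighted_rrf and the document ids are made up for illustration and are not LangChain API:

```python
from collections import defaultdict
from typing import Dict, Sequence


def weighted_rrf(
    ranked_lists: Sequence[Sequence[str]],
    weights: Sequence[float],
    c: int = 60,
) -> Dict[str, float]:
    """Fuse ranked lists of document ids into one score per id."""
    if len(ranked_lists) != len(weights):
        raise ValueError("need exactly one weight per ranked list")
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        # Ranks start at 1, so the top hit contributes weight / (1 + c).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    return dict(scores)


# Hypothetical ids returned by a BM25 retriever and a semantic retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
semantic_ranking = ["doc_b", "doc_a", "doc_d"]

fused = weighted_rrf([bm25_ranking, semantic_ranking], weights=[0.4, 0.6])
ordered = sorted(fused, key=fused.get, reverse=True)
```

Here doc_b ends up first because the higher-weighted semantic list ranked it on top. The fused values only say how highly the retrievers ranked a document, not how similar it is to the query.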

Upvotes: 1

Views: 1018

Answers (1)

Dev_Man

Reputation: 896

This is how I ended up doing it. I didn't add the score to the metadata, but I still had to go to the vectorstore level to get the scores returned along with the documents.

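from typing import Any, List, Optional, Tuple

from langchain_core.documents import Document
from langchain_core.runnables import RunnableConfig
from langchain_core.vectorstores import VectorStoreRetriever

# Placeholder value (assumption); set this to your embedding model's input limit in characters.
max_embedding_length = 8192

class CustomSemanticRetriever(VectorStoreRetriever):
    # Returns (document, relevance score) pairs instead of bare documents.
    def invoke(
        self, input: str, config: Optional[RunnableConfig] = None, **kwargs: Any
    ) -> List[Tuple[Document, float]]:
        # Truncate overly long queries before they are embedded.
        if len(input) > max_embedding_length:
            input = input[:max_embedding_length]
        return self.vectorstore.similarity_search_with_relevance_scores(
            input, k=self.search_kwargs.get("k", 4)
        )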
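
If a later step expects plain documents rather than (document, score) pairs, one option is to attach the score to a copy of each document, so the object held by the vectorstore is not modified. A minimal sketch of that idea, assuming the pairs look like the return value of the retriever; Doc is a stand-in dataclass, not the real LangChain Document, and with_scores is a hypothetical helper:

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def with_scores(pairs: List[Tuple[Doc, float]]) -> List[Doc]:
    """Return copies of the documents with the score added to each copy's metadata."""
    scored = []
    for doc, score in pairs:
        doc_copy = deepcopy(doc)  # leave the stored document untouched
        doc_copy.metadata["score"] = score
        scored.append(doc_copy)
    return scored


# Hypothetical output shaped like the (document, score) pairs from the retriever.
stored = Doc("FAISS is a vector index.", {"source": "notes.txt"})
pairs = [(stored, 0.83)]

results = with_scores(pairs)
```

The score then travels with the result of this one query, while the original document's metadata stays as it was.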

Upvotes: 0
