martinrame

Reputation: 1

How can I add filters to VertexAI queries using langchain?

I have uploaded documents with metadata to VertexAI Search and Conversation and I am using langchain to build a RAG model based on the documents I uploaded. I would like to add filters on the documents used to generate the answer based on the metadata I uploaded.

The metadata is in this format:

{
  "id": 1,
  "structData": {
    "id": "2743",
    "tags": "Operation",
    "language": "English"
  },
  "content": {"mimeType": "text/plain", "uri": "<link to GCS location of the file>"}
}

For example, I would like to get an answer only based on documents that have "English" as "language".

I use this code to get the answer to my query using langchain:

import vertexai
from langchain.llms import VertexAI
from langchain.chains import RetrievalQA
from langchain.retrievers import GoogleVertexAISearchRetriever

vertexai.init(project=PROJECT_ID, location=REGION)
llm = VertexAI(model_name=MODEL)

retriever = GoogleVertexAISearchRetriever(
    project_id=PROJECT_ID,
    location_id=DATA_STORE_LOCATION,
    data_store_id=DATA_STORE_ID,
    get_extractive_answers=True,
    max_documents=10,
    max_extractive_segment_count=1,
    max_extractive_answer_count=5
)

search_query = "What are some spices that can help lower blood sugar?"

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
)

results = retrieval_qa({"query": search_query})

I already tried passing a "filter" argument to the retriever, following the langchain documentation, but I always run into errors, no matter which syntax I use.

Upvotes: 0

Views: 566

Answers (1)

kephin

Reputation: 11

I assume you're using Google Cloud, so check the following:

  1. The metadata was uploaded properly when you created the app in Agent Builder. You should see a 'Schema' based on your metadata under the data store.
  2. In the 'Schema', the field you want to filter on, for example "language", is marked 'indexable'.
  3. Once both are true, you can pass 'filter' to GoogleVertexAISearchRetriever like so:
retriever = GoogleVertexAISearchRetriever(
    filter="language: ANY(\"English\")",
    project_id="123",
    ...
)

The Vertex AI Search documentation on filter expressions has more information about the syntax.
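If you build the filter from variables, a small helper avoids hand-escaping the quotes. The following is only a sketch: `any_filter` and `all_of` are hypothetical helper names, not part of langchain. I'm assuming the `field: ANY("value", ...)` form shown above, and that expressions can be joined with `AND`. Values containing a double quote are rejected rather than escaped, because I'm not sure of the escaping rules.

```python
from typing import Iterable


def any_filter(field: str, values: Iterable[str]) -> str:
    """Build a `field: ANY("a", "b")` expression for an indexable text field.

    Hypothetical helper, not part of langchain or the Google client library.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    for value in values:
        if '"' in value:
            # Escaping rules are an unknown here, so reject instead of guessing.
            raise ValueError(f"value contains a double quote: {value!r}")
    quoted = ", ".join(f'"{value}"' for value in values)
    return f"{field}: ANY({quoted})"


def all_of(*expressions: str) -> str:
    """Join several filter expressions with AND (assumed to be supported)."""
    return " AND ".join(expressions)


language_filter = any_filter("language", ["English"])
print(language_filter)

combined_filter = all_of(language_filter, any_filter("tags", ["Operation"]))
print(combined_filter)

# The resulting string is what goes into the retriever from the question:
# retriever = GoogleVertexAISearchRetriever(..., filter=language_filter)
```

The field names ("language", "tags") come from the structData in the question. They still have to be marked indexable in the schema, otherwise the filter will be rejected.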

Upvotes: 1
