ARINDAM BANERJEE

Reputation: 689

RAG - Specifying task_type during Question Answering with Vertex AI Embeddings

I'm using Vertex AI embeddings with LangChain for a RAG application. Reference: https://cloud.google.com/blog/products/ai-machine-learning/improve-gen-ai-search-with-vertex-ai-embeddings-and-task-types/

I've created my embeddings using task_type="QUESTION_ANSWERING". However, I can't figure out how to specify the same task_type during the actual question-answering retrieval process. The code I'm using is below:


from langchain_google_vertexai import VertexAIEmbeddings
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

vertex_embeddings = VertexAIEmbeddings(model_name="text-multilingual-embedding-002")

# Some code to retrieve pgvector vector_store
vector_store = get_pgvector(collection_name)

# Create chain to answer questions
NUMBER_OF_RESULTS         = 1
SEARCH_DISTANCE_THRESHOLD = 0.6

retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": NUMBER_OF_RESULTS,
        "search_distance": SEARCH_DISTANCE_THRESHOLD,
    },
)


qa = RetrievalQA.from_chain_type(
    llm                     = get_llm(),
    chain_type              = "stuff",
    retriever               = retriever,
    return_source_documents = True,
    verbose                 = True,
    chain_type_kwargs       = {
        "prompt": PromptTemplate(
            template        = prompt_template,  # prompt text defined elsewhere
            input_variables = ["context", "question"],
        ),
    },
)

I haven't found any way to pass the task_type to the retrieval step. One workaround is to increase NUMBER_OF_RESULTS and then re-rank the results with sklearn's cosine_similarity, using query embeddings generated with the desired task_type, but this adds unwanted latency.
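For context, the workaround would look roughly like this. This is a sketch, not my exact code: it assumes VertexAIEmbeddings.embed() accepts an embeddings_task_type argument (as in recent versions of langchain_google_vertexai), and it re-embeds the retrieved chunks, which is where the extra latency comes from:

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Over-fetch (larger k), then re-rank against a QUESTION_ANSWERING query embedding.
docs = retriever.invoke(question)
query_vec = vertex_embeddings.embed(
    [question], embeddings_task_type="QUESTION_ANSWERING"
)[0]
doc_vecs = vertex_embeddings.embed(
    [d.page_content for d in docs], embeddings_task_type="RETRIEVAL_DOCUMENT"
)
scores = cosine_similarity([query_vec], doc_vecs)[0]
best_doc = docs[int(np.argmax(scores))]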

Is there a way to directly specify the task_type during retrieval with langchain_google_vertexai and pgvector so that the most relevant results for question answering are returned directly, avoiding the need for post-processing? Any suggestions or examples would be greatly appreciated!

Upvotes: 0

Views: 111

Answers (1)

McMaco

Reputation: 178

If you want to use embeddings for document search, information retrieval, or Q&A use cases such as search, chatbots, or RAG, you need to run two embedding jobs with different task types:

  1. Use the RETRIEVAL_DOCUMENT task type to create optimized embeddings for your documents (also called a corpus).

  2. Use one of the following task types to create optimized embeddings for your queries, depending on the nature of the queries:

  • RETRIEVAL_QUERY: Use as the default task type for queries, such as "best restaurants in Vancouver", "green vegetables", or "What is the best cookie recipe?".

  • QUESTION_ANSWERING: Use in cases where all queries are formatted as proper questions, such as "Why is the sky blue?" or "How do I tie my shoelaces?".

  • FACT_VERIFICATION: Use in cases where you want to retrieve a document from your corpus that proves or disproves a statement. For example, the query "apples grow underground" might retrieve an article about apples that would ultimately disprove the statement.

Key Point: To get embeddings that you can use for information retrieval, embed your documents with the RETRIEVAL_DOCUMENT task type and embed your queries with one of the query task types above (RETRIEVAL_QUERY by default, or QUESTION_ANSWERING in your case).
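In langchain_google_vertexai, one way to wire this in end to end is to subclass VertexAIEmbeddings so that document and query embeddings use different task types. This is a minimal sketch, not an official API: it assumes a recent version of the package in which VertexAIEmbeddings.embed() accepts an embeddings_task_type argument:

from typing import List
from langchain_google_vertexai import VertexAIEmbeddings

class TaskTypeEmbeddings(VertexAIEmbeddings):
    """Hypothetical wrapper: separate task types for corpus and queries."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Corpus side: embeddings optimized for being retrieved.
        return self.embed(texts, embeddings_task_type="RETRIEVAL_DOCUMENT")

    def embed_query(self, text: str) -> List[float]:
        # Query side: every query here is a proper question.
        return self.embed([text], embeddings_task_type="QUESTION_ANSWERING")[0]

vertex_embeddings = TaskTypeEmbeddings(model_name="text-multilingual-embedding-002")

Since the vector store calls embed_query() on the incoming question at search time, a retriever built on top of this embeddings object ranks results with the QUESTION_ANSWERING task type directly, with no post-filtering.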

In addition, after you've generated your embeddings, you can add them to a vector database such as Vertex AI Vector Search. This enables low-latency retrieval, which becomes critical as your data grows.
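With pgvector, the wiring might look like this (a sketch assuming the langchain_community PGVector integration; the collection name and connection string are placeholders):

from langchain_community.vectorstores.pgvector import PGVector

# Index the corpus with RETRIEVAL_DOCUMENT embeddings; at query time,
# the retriever calls embed_query(), which uses QUESTION_ANSWERING.
vector_store = PGVector.from_documents(
    documents         = docs,  # your corpus as LangChain Documents
    embedding         = vertex_embeddings,
    collection_name   = "my_collection",                                 # placeholder
    connection_string = "postgresql+psycopg2://user:pass@host:5432/db",  # placeholder
)
retriever = vector_store.as_retriever(search_kwargs={"k": 1})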

Upvotes: 0
