Reputation: 689
I'm using Vertex AI embeddings with LangChain for a RAG application. Reference: https://cloud.google.com/blog/products/ai-machine-learning/improve-gen-ai-search-with-vertex-ai-embeddings-and-task-types/
I've created my embeddings using task_type="QUESTION_ANSWERING". However, I can't figure out how to specify the same task_type during the actual question-answering retrieval process. The code I'm using is below:
from langchain.chains import RetrievalQA
from langchain_core.prompts import PromptTemplate
from langchain_google_vertexai import VertexAIEmbeddings

vertex_embeddings = VertexAIEmbeddings(model_name="text-multilingual-embedding-002")

# Some code to retrieve the pgvector vector store (helper defined elsewhere)
vector_store = get_pgvector(collection_name)

# Create chain to answer questions
NUMBER_OF_RESULTS = 1
SEARCH_DISTANCE_THRESHOLD = 0.6

retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": NUMBER_OF_RESULTS,
        "search_distance": SEARCH_DISTANCE_THRESHOLD,
    },
)

qa = RetrievalQA.from_chain_type(
    llm=get_llm(),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    verbose=True,
    chain_type_kwargs={
        "prompt": PromptTemplate(
            template=prompt_template,
            input_variables=["context", "question"],
        ),
    },
)
I haven't found any way to pass the task_type to the retrieval process. One workaround (sketched below) is to increase NUMBER_OF_RESULTS, then re-rank the results with sklearn's cosine_similarity using embeddings generated with the right task_type, but this adds unwanted latency.
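Roughly, that workaround looks like this (a sketch; it re-embeds the retrieved documents on every request, which is where the extra latency comes from):

from sklearn.metrics.pairwise import cosine_similarity

def answer_with_rerank(question: str, top_k: int = 1):
    # Over-fetch with the retriever (NUMBER_OF_RESULTS set higher, e.g. 10)...
    docs = retriever.invoke(question)
    # ...then re-score with task_type-specific embeddings.
    query_vec = vertex_embeddings.embed(
        [question], embeddings_task_type="QUESTION_ANSWERING"
    )
    doc_vecs = vertex_embeddings.embed(
        [d.page_content for d in docs], embeddings_task_type="RETRIEVAL_DOCUMENT"
    )
    scores = cosine_similarity(query_vec, doc_vecs)[0]
    ranked = sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]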
Is there a way to directly specify the task_type during retrieval with langchain_google_vertexai and pgvector so that the most relevant results for question answering are returned directly, avoiding the need for post-processing? Any suggestions or examples would be greatly appreciated!
Upvotes: 0
Views: 111
Reputation: 178
If you want to use embeddings for document search or information retrieval and Q&A use cases such as search, chatbots, or RAG, as discussed in the blog post you linked, you need to run two embedding jobs with different task types (see the sketch after the list):
Use the RETRIEVAL_DOCUMENT task type to create optimized embeddings for your documents (also called a corpus).
Use one of the following task types to create optimized embeddings for your queries, depending on the nature of the queries:
RETRIEVAL_QUERY: Use as the default task type for queries, such as "best restaurants in Vancouver", "green vegetables", or "What is the best cookie recipe?".
QUESTION_ANSWERING: Use in cases where all queries are formatted as proper questions, such as "Why is the sky blue?" or "How do I tie my shoelaces?".
FACT_VERIFICATION: Use in cases where you want to retrieve a document from your corpus that proves or disproves a statement. For example, the query "apples grow underground" might retrieve an article about apples that would ultimately disprove the statement.
Key Point: To get embeddings that you can use for information retrieval, use the RETRIEVAL_DOCUMENT task type to embed your documents and the RETRIEVAL_QUERY task type to embed your queries.
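With langchain_google_vertexai, both jobs can be run through the embed() method, which accepts an embeddings_task_type argument. A minimal sketch (the sample texts are placeholders):

from langchain_google_vertexai import VertexAIEmbeddings

embeddings = VertexAIEmbeddings(model_name="text-multilingual-embedding-002")

# Job 1: embed the corpus with RETRIEVAL_DOCUMENT.
doc_vectors = embeddings.embed(
    ["Apples grow on trees, not underground."],
    embeddings_task_type="RETRIEVAL_DOCUMENT",
)

# Job 2: embed the query with a query-side task type.
query_vector = embeddings.embed(
    ["Do apples grow underground?"],
    embeddings_task_type="QUESTION_ANSWERING",
)[0]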
In addition, after you've generated your embeddings you can add them to a vector database, like Vector Search. This enables low-latency retrieval, which becomes critical as the size of your data increases.
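To get the same behavior at retrieval time with pgvector, one option is to subclass VertexAIEmbeddings so that embed_query() uses your chosen task type, then hand that object to the vector store. A minimal sketch, assuming your get_pgvector helper can accept an embeddings object (that parameter is hypothetical; adapt it to your helper):

from langchain_google_vertexai import VertexAIEmbeddings

class QuestionAnsweringVertexAIEmbeddings(VertexAIEmbeddings):
    # Override the query path only; embed_documents() keeps its
    # default RETRIEVAL_DOCUMENT task type for the corpus.
    def embed_query(self, text: str) -> list[float]:
        return self.embed([text], embeddings_task_type="QUESTION_ANSWERING")[0]

vertex_embeddings = QuestionAnsweringVertexAIEmbeddings(
    model_name="text-multilingual-embedding-002"
)

# Hypothetical: pass the embeddings into your pgvector helper. A retriever
# built from this store then embeds every incoming question with
# QUESTION_ANSWERING, so no post-filtering step is needed.
vector_store = get_pgvector(collection_name, embeddings=vertex_embeddings)
retriever = vector_store.as_retriever(
    search_type="similarity", search_kwargs={"k": 1}
)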
Upvotes: 0