Zahra Reyhanian

Reputation: 99

Implement a search engine chain using Tavily in LangChain

I want to implement a search engine chain using Tavily in LangChain. The chain takes the user's query as input and returns up to 5 related documents. Each retrieved document must be a LangChain Document, with the content of the page as page_content and the URL of the corresponding site in metadata. I must use the langchain_core.documents.base.Document class to define the documents. So the chain will have two main parts:

  1. Tavily search platform
  2. A parser that converts the search output into standard LangChain documents.

I wrote this code, but I don't know how to convert the Tavily output into standard Document objects:

from pydantic import BaseModel, Field
from langchain_core.documents.base import Document
from langchain_core.output_parsers import PydanticOutputParser
from langchain_community.tools.tavily_search import TavilySearchResults

# Tavily search tool, limited to 5 results
search = TavilySearchResults(max_results=5)

class ParsedDocument(BaseModel):
    content: str = Field(description="This refers to the content of the search.")
    url: str = Field(description="This refers to the url of the search.")

# This is the part that does not produce Document objects
search_parser = PydanticOutputParser(pydantic_object=ParsedDocument)
search_engine_chain = search | search_parser

I would be grateful for any help with changing this code.

Upvotes: 2

Views: 414

Answers (1)

Zahra Reyhanian

Reputation: 99

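I finally found the answer. PydanticOutputParser is meant for parsing text produced by a language model, not the list of result dictionaries that the search tool returns, so I replaced it with a plain function: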

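from pydantic import BaseModel, Field
from langchain_core.documents.base import Document
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults(max_results=5)

class ParsedDocument(BaseModel):
    content: str = Field(description="This refers to the content of the search.")
    url: str = Field(description="This refers to the url of the search.")

# Define a custom parser
def custom_parser(search_results):
    parsed_documents = []
    for result in search_results:  # Adjust this line based on the actual structure of search_results
        # Validate the two expected fields, then wrap them in a LangChain Document
        parsed_document = ParsedDocument(content=result['content'], url=result['url'])
        document = Document(page_content=parsed_document.content, metadata={'url': parsed_document.url})
        parsed_documents.append(document)
    return parsed_documents

# Piping a runnable into a plain function wraps the function in a RunnableLambda
search_engine_chain = search | custom_parser

# Example call: documents = search_engine_chain.invoke("your query here")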
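
To try the conversion step on its own, without a Tavily API key, here is a minimal sketch. It assumes each result is a dict with 'content' and 'url' keys, as in the parser above. The helper name results_to_documents, the max_docs parameter, and the sample data are all made up for illustration, and the guard against non-list input is only a defensive assumption about what the tool might return on failure. The fallback Document dataclass exists only so the snippet runs when langchain_core is not installed.

```python
from dataclasses import dataclass, field

try:
    from langchain_core.documents import Document
except ImportError:
    # Stand-in with the same two fields, used only when LangChain is not installed.
    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def results_to_documents(search_results, max_docs=5):
    """Convert Tavily-style result dicts into Documents (hypothetical helper)."""
    # Assumption: anything that is not a list (e.g. an error message) yields no documents.
    if not isinstance(search_results, list):
        return []
    documents = []
    for result in search_results[:max_docs]:
        # Skip entries that are not dicts or that lack either expected key.
        if not isinstance(result, dict) or "content" not in result or "url" not in result:
            continue
        documents.append(
            Document(page_content=result["content"], metadata={"url": result["url"]})
        )
    return documents


# Made-up sample data in the shape the parser above expects.
sample_results = [
    {"url": "https://example.com/a", "content": "First page text."},
    {"url": "https://example.com/b", "content": "Second page text."},
    {"url": "https://example.com/c"},  # no content key, so this entry is skipped
]

docs = results_to_documents(sample_results)
for doc in docs:
    print(doc.metadata["url"], "->", doc.page_content)
```

This version could be used in place of custom_parser in the chain (search | results_to_documents). It skips malformed entries instead of raising, which may or may not be what you want.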

Upvotes: 0
