Reputation: 11
I am using LangChain with a large chunk of text (journalistic material). Can I force LangChain to return results based only on the data I have stored, and not on what ChatGPT already knows? I understand it needs to connect to OpenAI to read my data, but I need to force an "offline" search only.
Is there a better LLM option?
Thank you!
The first and second time I ask a question it says there is no info; the third time it falls back to ChatGPT to give an answer. I need to at least know where it is extracting the answer from.
Upvotes: 1
Views: 3607
Reputation: 2470
There are lots of models on Hugging Face that are downloaded to your disk the first time the code runs and can then serve queries locally (no internet required), e.g. doing a similarity_search with embeddings. For example:
from langchain.embeddings import HuggingFaceBgeEmbeddings

# The model is downloaded once, then embeds text entirely on your machine
embeddings = HuggingFaceBgeEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
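To complete the picture, here is a minimal sketch of the local similarity_search mentioned above; FAISS as the vector store is my assumption (it needs the faiss-cpu package installed), and the text chunks are placeholders:

from langchain.vectorstores import FAISS

# Build an index over your own text chunks; the index lives locally
db = FAISS.from_texts(
    ["first chunk of your journalistic text", "second chunk"],
    embeddings,
)

# Retrieval runs against the local index only, with no API call involved
docs = db.similarity_search("your question", k=2)
print(docs[0].page_content)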
Upvotes: 1
Reputation: 49561
You can load a specific file and have your LLM answer based on that context. There are a variety of ways to implement this. For example:
import os

from langchain.chains.question_answering import load_qa_chain
from langchain.document_loaders import PyPDFLoader
from langchain.llms import OpenAI

os.environ["OPENAI_API_KEY"] = "your-api-key"

# Load your own document; its pages are what gets passed to the chain
pdf_loader = PyPDFLoader('./your_doc.pdf')
documents = pdf_loader.load()

# The chain answers the question from the supplied documents
chain = load_qa_chain(llm=OpenAI())
query = 'what is your question?'
response = chain.run(input_documents=documents, question=query)
print(response)
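Since you mention the model sometimes falls back to ChatGPT's own knowledge, one common mitigation (a sketch; the prompt wording is my own, not part of the code above) is to give the chain a prompt that restricts it to the supplied context:

from langchain.prompts import PromptTemplate

# Prompt wording is an assumption; adjust it for your data
restrict_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say \"I don't know\".\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)

# "stuff" inserts all loaded documents into the prompt's {context} slot
chain = load_qa_chain(llm=OpenAI(temperature=0), chain_type="stuff", prompt=restrict_prompt)
response = chain.run(input_documents=documents, question=query)

This does not make the search truly offline, but it at least forces the model to ground its answer in your documents.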
Alternatively, Hugging Face has models that you can download to your machine; the Hugging Face documentation explains how to set up Transformers and models for offline use.
Beware that you need strong hardware to interact with an LLM offline.
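For example, a rough sketch using LangChain's HuggingFacePipeline wrapper (google/flan-t5-base is just a small example model I picked, not something from this answer; generation kwargs may need adjusting for your LangChain version):

from langchain.llms import HuggingFacePipeline

# The model is downloaded on first run, then everything executes locally
local_llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",  # assumption: any seq2seq model works here
    task="text2text-generation",
    model_kwargs={"temperature": 0, "max_length": 256},
)
print(local_llm("What is LangChain?"))

You can then pass local_llm to load_qa_chain in place of OpenAI() and the whole pipeline runs without an API key.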
Upvotes: 0