KhoPhi

Reputation: 9527

Retrieval-augmented generation without OpenAIEmbeddings

I'm playing with Hugging Face and some of the models on there, trying to achieve something along the lines of RAG (retrieval-augmented generation). It seems like a pretty clear recipe with all the needed ingredients, but the cooking sequence is what I need help with.

What I want to do:


What I've done

With that in mind, here's what I've been able to do so far over the past couple of days (see the Python script below).

But the problem I keep running into is that EVERY article/tutorial/piece of documentation I've come across so far cannot go three sentences without mentioning something along the lines of OpenAI. *sigh*

I do not want to use ANYTHING OpenAI (is that even possible?)

Below is my Python script.

import os
from langchain.chains import LLMChain
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
from huggingface_hub import InferenceClient

# <rant>
# Where to import what from seems to be a whack-a-mole sport with this 
# langchain project. They can't seem to keep a module at a location
# even for a few version upgrades straight. Woow!

os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'hf_xx'
# os.environ["OPENAI_API_KEY"] = 'sk-xx' # I don't wanna use anything OpenAI*

import gradio as gr

# Load openchat model
openchat = InferenceClient(model="openchat/openchat_3.5", token='hf_xx')
documents = ["trainingData/pdfs.pdf"]

# Define embedding function
def extract_embedding(document):
    # Extract embedding using OpenAIEmbeddings
    embedding = OpenAIEmbeddings.embed_query(text=document)
    return embedding

# Load document embeddings
document_embeddings = [extract_embedding(document) for document in documents]

# Load vectorstore index
index = FAISS.from_documents(documents=documents, embeddings=document_embeddings)

# Define LLM chain
llm_chain = LLMChain(llm=openchat)

# Create RagChain manually
def predict(docs, input):
    # Retrieve relevant documents from index
    retrieved_docs = index.query(input)

    # Run LLM chain on retrieved documents
    outputs = []
    for doc in retrieved_docs:
        output = llm_chain.predict(input=doc)
        outputs.append(output)

    return outputs

# Launch Gradio app
iface = gr.Interface(fn=predict, inputs="text", outputs="text").launch()

How far off am I from the mark? And how do I get on track?

Currently, when I run the above script, I get the error:

TypeError: OpenAIEmbeddings.embed_query() missing 1 required positional argument: 'self'

As mentioned earlier, I'd prefer not to use OpenAI anywhere in the script above, so any alternatives are welcome.

I'd appreciate any insights.

Upvotes: 2

Views: 1487

Answers (1)

Duong Vu

Reputation: 197

Yes, it's possible. I have been stumbling into the same issue, and there are plenty of alternative options.

Specifically, you're asking to replace everything OpenAI, and there are two OpenAI pieces in your script: the embedding model and the LLM itself.

For the embedding model, look at LangChain's list of embedding integrations. I suggest HuggingFaceEmbeddings from the langchain-huggingface package, which lets you use a model from the HF Hub or a local directory. Since this is your main concern, here's what the code would look like:

from langchain_huggingface import HuggingFaceEmbeddings

# Define embedding function
def extract_embedding(document: str, model_path="BAAI/bge-small-en-v1.5"):
    """Extract an embedding using HuggingFaceEmbeddings,
       with a default model from the HF Hub if no path is specified.
    """
    # Note: in practice, instantiate the model once and reuse it; building it per call is slow
    embed_model = HuggingFaceEmbeddings(model_name=model_path)
    embedding = embed_model.embed_query(text=document)
    return embedding
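
Wiring this into FAISS is where your script also needs changing: FAISS.from_documents is being handed raw file paths plus a list of precomputed vectors, but LangChain's vector stores expect Document objects and the embedding object itself. Here's a rough sketch of how that could look (the PDF loader, splitter settings and bge-small model are illustrative assumptions, not the only options):

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Load and split the PDF (path taken from your script)
docs = PyPDFLoader("trainingData/pdfs.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# Build the index by passing the embedding *object*, not precomputed vectors
embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
index = FAISS.from_documents(documents=chunks, embedding=embed_model)

# Retrieval is then a plain similarity search, with no OpenAI anywhere
relevant_docs = index.similarity_search("your question here", k=3)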

For the LLM, you can use any model from the list of open LLMs, such as T5, Llama or Mistral. Ollama is a great tool for pulling and running any model it supports.
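
For completeness, here's a rough sketch of how a non-OpenAI LLM could be tied to the retriever above, assuming the HuggingFaceEndpoint wrapper from langchain-huggingface and a stock RetrievalQA chain (the model id and token placeholder are simply taken from your script; a local model pulled with Ollama, via langchain-ollama's ChatOllama, would slot in the same way):

from langchain_huggingface import HuggingFaceEndpoint
from langchain.chains import RetrievalQA

# Any hosted open model works here; openchat_3.5 is the one from your script
llm = HuggingFaceEndpoint(
    repo_id="openchat/openchat_3.5",
    huggingfacehub_api_token="hf_xx",
    max_new_tokens=512,
)

# Tie the LLM to the FAISS retriever instead of hand-rolling a loop over documents
rag_chain = RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever())

def predict(question):
    return rag_chain.invoke({"query": question})["result"]

With predict taking a single text input, your gr.Interface(fn=predict, inputs="text", outputs="text") call works unchanged.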

Upvotes: 0
