KhoPhi

Reputation: 9527

Retrieval-augmented generation without OpenAIEmbeddings

I'm playing with Hugging Face and some of the models on there, trying to achieve something along the lines of RAG (retrieval-augmented generation). It seems like a pretty clear recipe with all the needed ingredients, but the cooking sequence is what I need help with.

What I want to do:


What I've done

With that in mind, here's what I've been able to do so far over the past couple of days (see the Python script below).

But the problem I keep running into is that EVERY article/tutorial/piece of documentation I've come across so far cannot go three sentences without mentioning something along the lines of OpenAI. *sigh*

I do not want to use ANYTHING OpenAI (is that even possible?)

Below is my Python script.

import os
from langchain.chains import LLMChain
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
from huggingface_hub import InferenceClient

# <rant>
# Where to import what from seems to be a whack-a-mole sport with this 
# langchain project. They can't seem to keep a module at a location
# even for a few version upgrades straight. Woow!

os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'hf_xx'
# os.environ["OPENAI_API_KEY"] = 'sk-xx' # I don't wanna use anything OpenAI*

import gradio as gr

# Load openchat model
openchat = InferenceClient(model="openchat/openchat_3.5", token='hf_xx')
documents = ["trainingData/pdfs.pdf"]

# Define embedding function
def extract_embedding(document):
    # Extract embedding using OpenAIEmbeddings
    embedding = OpenAIEmbeddings.embed_query(text=document)
    return embedding

# Load document embeddings
document_embeddings = [extract_embedding(document) for document in documents]

# Load vectorstore index
index = FAISS.from_documents(documents=documents, embeddings=document_embeddings)

# Define LLM chain
llm_chain = LLMChain(llm=openchat)

# Create RagChain manually
def predict(docs, input):
    # Retrieve relevant documents from index
    retrieved_docs = index.query(input)

    # Run LLM chain on retrieved documents
    outputs = []
    for doc in retrieved_docs:
        output = llm_chain.predict(input=doc)
        outputs.append(output)

    return outputs

# Launch Gradio app
iface = gr.Interface(fn=predict, inputs="text", outputs="text").launch()

How far off am I from the mark? And how do I get on track?

Currently, when I run the above script, I get the error:

TypeError: OpenAIEmbeddings.embed_query() missing 1 required positional argument: 'self'

As mentioned earlier, I'd prefer not to use OpenAI anywhere in the script above, so any alternatives are welcome.

I'd appreciate any insights.

Upvotes: 2

Views: 1487

Answers (1)

Duong Vu

Reputation: 197

Yes, it's possible. I have been stumbling into the same issue, and there are plenty of alternative options.

Specifically, you're asking to replace everything OpenAI, and there are two OpenAI pieces in your script: the embedding model and the LLM itself.

For the embedding model, look at LangChain's list of embedding integrations. I suggest HuggingFaceEmbeddings from the langchain-huggingface package, which lets you use a model from the HF Hub or a local directory. Since this is your main concern, here's what the code would look like:

from langchain_huggingface import HuggingFaceEmbeddings

# Define embedding function
def extract_embedding(document: str, model_path="BAAI/bge-small-en-v1.5"):
    """Extract an embedding using HuggingFaceEmbeddings,
       with a default model from the HF Hub if no path is specified.
    """
    # Note: in practice, instantiate the model once and reuse it; building it per call is slow
    embed_model = HuggingFaceEmbeddings(model_name=model_path)
    embedding = embed_model.embed_query(text=document)
    return embedding
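
Wiring this into FAISS is where your script also needs changing: FAISS.from_documents is being handed raw file paths plus a list of precomputed vectors, but LangChain's vector stores expect Document objects and the embedding object itself. Here's a rough sketch of how that could look (the PDF loader, splitter settings and bge-small model are illustrative assumptions, not the only options):

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Load and split the PDF (path taken from your script)
docs = PyPDFLoader("trainingData/pdfs.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# Build the index by passing the embedding *object*, not precomputed vectors
embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
index = FAISS.from_documents(documents=chunks, embedding=embed_model)

# Retrieval is then a plain similarity search, with no OpenAI anywhere
relevant_docs = index.similarity_search("your question here", k=3)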

For the LLM, you can use any model from the list of open LLMs, such as T5, Llama or Mistral. Ollama is a great tool for pulling and running any model it supports.
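
For completeness, here's a rough sketch of how a non-OpenAI LLM could be tied to the retriever above, assuming the HuggingFaceEndpoint wrapper from langchain-huggingface and a stock RetrievalQA chain (the model id and token placeholder are simply taken from your script; a local model pulled with Ollama, via langchain-ollama's ChatOllama, would slot in the same way):

from langchain_huggingface import HuggingFaceEndpoint
from langchain.chains import RetrievalQA

# Any hosted open model works here; openchat_3.5 is the one from your script
llm = HuggingFaceEndpoint(
    repo_id="openchat/openchat_3.5",
    huggingfacehub_api_token="hf_xx",
    max_new_tokens=512,
)

# Tie the LLM to the FAISS retriever instead of hand-rolling a loop over documents
rag_chain = RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever())

def predict(question):
    return rag_chain.invoke({"query": question})["result"]

With predict taking a single text input, your gr.Interface(fn=predict, inputs="text", outputs="text") call works unchanged.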

Upvotes: 0
