Bert

Reputation: 11

FAISS vectorstore created with LangChain yields AttributeError: 'OpenAIEmbeddings' object has no attribute 'deployment' / 'headers'

For the past few weeks I have been working on a QA retrieval chatbot project with LangChain and OpenAI in Python. I have an ingest pipeline set up in a notebook on Google Colab, with which I extract text from PDFs, create embeddings, and store them in FAISS vectorstores, which I then use to test my LangChain chatbot (a Streamlit Python app). I have a bunch of vectorstores (one per PDF) that I have created over the past few days.

The Google Colab pipeline simply takes the extracted PDF pages, creates LangChain documents, and finally embeds them and saves the vectorstore with the following code:

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
import pickle

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)
with open("file.pkl", "wb") as f:
    pickle.dump(vectorstore, f)

I would then manually download the FAISS vectorstore file.pkl and store it on my local machine in a db folder that my Streamlit app accesses as follows:

if os.path.exists(f"db/{filename}.pkl"):
    with open(f"db/{filename}.pkl", "rb") as f:
        vectorstore = pickle.load(f)
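
(For completeness: FAISS also has its own save_local / load_local, which re-creates the embeddings object at load time instead of unpickling it. All of my existing stores were created with pickle as shown above, but roughly this is what the alternative would look like; the folder name is just a placeholder.)

# In the Colab notebook, instead of pickling the whole object:
vectorstore.save_local("db/faiss_index")

# In the Streamlit app, rebuild the store with a fresh embeddings object:
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

vectorstore = FAISS.load_local("db/faiss_index", OpenAIEmbeddings())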

Since today (Monday 3 July), any new FAISS vectorstore that I create with my Google Colab notebook fails to load in my app: I get an exception saying that the variable "vectorstore" is not defined.
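
(I assume that NameError is just a downstream symptom of the load itself failing; wrapping the load roughly like this, reusing the names from the snippet above, is how I would expect to surface the underlying exception.)

try:
    with open(f"db/{filename}.pkl", "rb") as f:
        vectorstore = pickle.load(f)
except Exception as e:
    # Show the real unpickling error instead of hitting a NameError later
    # (assumes the usual: import streamlit as st)
    st.error(f"Failed to load vectorstore: {e}")
    st.stop()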

I thought I'd try downloading the notebook and creating the vectorstore locally, but the result was the same.

Alas, I had not been paying attention to which versions of langchain and openai were being installed every time I ran my Colab notebook. Fearing the problem might be due to some update, I made sure both my Google Colab notebook and my local environment use the same versions:

langchain==0.0.205
openai==0.27.8
streamlit==1.22.0
faiss-cpu==1.7.4
tiktoken==0.4.0
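
(A minimal way to keep the two environments in sync is to pin the versions explicitly, e.g. at the top of the Colab notebook:)

!pip install langchain==0.0.205 openai==0.27.8 faiss-cpu==1.7.4 tiktoken==0.4.0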

Now the vectorstore gets loaded in the app, but I get the following error:

AttributeError: 'OpenAIEmbeddings' object has no attribute 'deployment'

If I create the vectorstore from the same notebook on my local machine, I get the following error:

AttributeError: 'OpenAIEmbeddings' object has no attribute 'headers'

Updating to the latest versions of langchain and openai does not help. I tried downgrading the langchain version, but eventually I reach one that no longer supports gpt-3.5-turbo-16k (the model used in my app) and I get a different kind of error when running my app.

Nothing else has changed: my app launches fine, and the vectorstores I created in the past few days still work. It is only the newly created vectorstores that no longer work.

What could have happened?

Upvotes: 1

Views: 11848

Answers (1)

hopeonthestreet

Reputation: 21

I found this resource: https://dagster.io/blog/training-llms

In order to generate the VectorStore and save it as a pkl file, they run the following:

from dagster import asset  # the @asset decorator comes from Dagster
from langchain.vectorstores.faiss import FAISS
from langchain.embeddings import OpenAIEmbeddings
import pickle

@asset
def vectorstore(documents):
    vectorstore_contents = FAISS.from_documents(documents, OpenAIEmbeddings())
    with open("vectorstore.pkl", "wb") as f:
        pickle.dump(vectorstore_contents, f)

Subsequently (having saved the pkl file locally), they read the pkl file back as a LangChain VectorStore object. I've tried this and it loaded the pickled object as a VectorStore with all of its attributes.

from langchain.vectorstores import VectorStore
import pickle

vectorstore_file = "vectorstore.pkl"

with open(vectorstore_file, "rb") as f:
    local_vectorstore: VectorStore = pickle.load(f)
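
As a quick sanity check after loading, a similarity search should exercise the unpickled embeddings object (the query string here is just an example):

# Run a small query against the loaded store and print the top hits
results = local_vectorstore.similarity_search("What is this document about?", k=2)
for doc in results:
    print(doc.page_content[:200])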

Hope this helps!

Upvotes: 1
