Reputation: 277
I can't work out why the error below occurs when I embed the pages of a PDF with LangChain and OpenAI. Any help is appreciated.
Code:
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
loader = PyPDFLoader("FILENAME.pdf")
pages = loader.load_and_split()
docs = loader.load()
texts = [doc.page_content for doc in docs]
# Create embeddings
embedder = OpenAIEmbeddings()
embeddings = embedder.embed_documents(texts)
# Store in vectorstore
store = FAISS.from_documents(pages, OpenAIEmbeddings())
store.add_documents(embeddings)
Error:
AttributeError: 'list' object has no attribute 'page_content'
Upvotes: 0
Views: 1033
Reputation: 2816
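You need to make a slight change to your code. FAISS.from_documents already embeds the documents and stores them, so you don't need to call embed_documents yourself. The error comes from store.add_documents(embeddings): add_documents expects Document objects, but embeddings is a list of lists of floats, and a plain list has no page_content attribute. Load the PDF once and pass the resulting documents straight to FAISS:
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
loader = PyPDFLoader("FILENAME.pdf")
# load_and_split() returns a list of Document objects
docs = loader.load_and_split()
# Embed the documents and store them in the vectorstore in one step
store = FAISS.from_documents(docs, OpenAIEmbeddings())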
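To see where the AttributeError comes from, here is a minimal sketch that needs neither LangChain nor an API key. The Document class and the add_documents function below are simplified stand-ins written for illustration, not the library's real implementations, and the vectors are made up. The sketch assumes only that the store reads page_content from every item it is given:

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for LangChain's Document class."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def add_documents(documents):
    """Simplified stand-in: read page_content from each item, as a vector store would."""
    return [doc.page_content for doc in documents]


docs = [Document("first page"), Document("second page")]
# Made-up vectors: one plain list of floats per document
embeddings = [[0.1, 0.2], [0.3, 0.4]]

# Works: every item is a Document with a page_content attribute
texts_ok = add_documents(docs)

# Fails: every item is a plain list, which has no page_content attribute
try:
    add_documents(embeddings)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(texts_ok)
print(error_message)
```

Passing the Document objects works, while passing the raw vectors fails in the same way as the traceback in the question.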
Upvotes: 0