Reputation: 23
I am following this tutorial
I am using a sample pdf from here
However, I replaced OpenAI with Hugging Face for the embeddings.
Below is my code:
import os
import pickle
from pprint import pprint
from PyPDF2 import PdfReader
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

pdf = 'sample.pdf'
pdf_reader = PdfReader(pdf)
text = ''
for page in pdf_reader.pages:
    text += page.extract_text()
pprint(text)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    length_function=len
)
chunks = text_splitter.split_text(text=text)
embeddings = HuggingFaceEmbeddings()
VectorStore = FAISS.from_texts(chunks, embedding=embeddings)
with open('sample.pkl', 'wb') as f:
    pickle.dump(VectorStore, f)
When I check the sample.pkl file, all I see is this line: No module named 'langchain'
I also checked the embeddings without saving them to the file, and I can see them.
I'm using Jupyter Notebook with Python 3.9.16, and I have all the libraries installed. And YES, I do have langchain installed. I wouldn't be able to import FAISS, HuggingFaceEmbeddings or RecursiveCharacterTextSplitter without it.
Upvotes: 0
Views: 2007
Reputation: 23
So I tried it on another computer with the same environment, and had the same error.
Turns out I was running the Jupyter notebook and Python script in Python 3.9.16, but the .pkl file that was being saved was using the computer's default Python version, 3.11.1.
So all I had to do was install all the required libraries on my default 3.11.1 version, and everything worked perfectly.
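For anyone wondering why a pickle file would complain about a missing module: a pickle stores only a reference to the class (module name plus class name), so whichever interpreter reads the file has to be able to import that module. Below is a minimal sketch of that behaviour, not the actual langchain code path. The module name `fake_vectorstore_pkg` and the `Store` class are made up for illustration and stand in for langchain and its FAISS wrapper.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a third-party package such as langchain.
fake = types.ModuleType("fake_vectorstore_pkg")


class Store:
    """Hypothetical stand-in for a vector store object."""

    def __init__(self, items):
        self.items = items


# Make the class look as if it lives in the stand-in package.
Store.__module__ = "fake_vectorstore_pkg"
Store.__qualname__ = "Store"
fake.Store = Store
sys.modules["fake_vectorstore_pkg"] = fake

# Pickling records "fake_vectorstore_pkg.Store" plus the instance state.
payload = pickle.dumps(Store(["chunk one", "chunk two"]))

# Case 1: the package is importable, so loading works.
restored = pickle.loads(payload)
print(restored.items)

# Case 2: simulate an interpreter where the package is not installed.
del sys.modules["fake_vectorstore_pkg"]
try:
    pickle.loads(payload)
    error_message = None
except ModuleNotFoundError as exc:
    error_message = str(exc)
print(error_message)
```

The second load fails with the same kind of message as in the question, which is consistent with the file being read by an interpreter that does not have the package installed.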
Upvotes: 0
Reputation: 943
I can see you didn't pass any repo_id, model_kwargs or huggingfacehub_api_token parameters to HuggingFaceHub(). Also make sure langchain is installed: pip install langchain
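When several Python versions are installed, a plain pip install may target a different interpreter than the one running the code. A rough way to check, using only the standard library, is sketched below. The helper name `interpreter_report` is made up for this example.

```python
import importlib.util
import sys


def interpreter_report(module_name):
    """Describe the running interpreter and whether it can find module_name."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


report = interpreter_report("langchain")
print(report)

# If module_found is False, this command installs the package into the
# exact interpreter that is running this code (it is only printed here).
install_command = [sys.executable, "-m", "pip", "install", "langchain"]
print(" ".join(install_command))
```

Running this both in the notebook and in whatever opens the .pkl file should show whether the two are using the same Python.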
Upvotes: 0