Nikhita Kanoj

Reputation: 11

AttributeError: 'Document' object has no attribute 'get_doc_id'

My application: load a CSV file into a knowledge graph (using KnowledgeGraphIndex) and use an LLM (HuggingFaceH4/zephyr-7b-beta) to retrieve answers from the graph store (SimpleGraphStore).

My problem: I want to pass multiple CSV files into the knowledge graph. I am using CSVLoader, and when I run KnowledgeGraphIndex I get this error: AttributeError: 'Document' object has no attribute 'get_doc_id'

This is how I am loading the CSV:

`from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()

splitter = CharacterTextSplitter(separator = "\n",
                                chunk_size=500, 
                                chunk_overlap=0,
                                length_function=len)
documents = splitter.split_documents(data)`

And this is my KnowledgeGraphIndex:

`index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    include_embeddings=True,
    max_triplets_per_chunk=2,
    embed_model=embed_model,
)`

Upvotes: 1

Views: 2099

Answers (1)

meshkati

Reputation: 2403

Try using Document from langchain.docstore.document and creating each document with this class:

from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
from llama_index.core import KnowledgeGraphIndex

csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()
documents = []
splitter = CharacterTextSplitter(separator="\n",
                                 chunk_size=500,
                                 chunk_overlap=0,
                                 length_function=len)
docs = splitter.split_documents(data)

# Rebuild each split chunk as a new Document.
# Iterate over docs (the split chunks), not the still-empty documents list.
for doc in docs:
    new_doc = Document(
        page_content=doc.page_content,
    )
    documents.append(new_doc)

index = KnowledgeGraphIndex.from_documents(
    documents,
    include_embeddings=True,
    max_triplets_per_chunk=2,
)
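
Note that the error message suggests KnowledgeGraphIndex.from_documents calls get_doc_id() on every item it receives, and LangChain's Document has no such method, so rebuilding the chunks as LangChain documents may not be enough on its own. Below is a minimal sketch of converting the chunks to the class the index expects instead. It assumes LlamaIndex's own Document (from llama_index.core) accepts text, metadata and id_ keyword arguments. The helper name convert_chunks and the "chunk-N" id scheme are hypothetical, chosen only for illustration.

```python
def convert_chunks(chunks, make_document):
    """Rebuild LangChain-style chunks with the document class the index expects.

    chunks: objects exposing `page_content` and `metadata`
            (for example, the output of splitter.split_documents).
    make_document: the target class or factory
            (assumed here: llama_index.core.Document).
    """
    converted = []
    for i, chunk in enumerate(chunks):
        converted.append(
            make_document(
                text=chunk.page_content,
                metadata=dict(chunk.metadata),  # copy, so the source chunk is untouched
                id_=f"chunk-{i}",  # hypothetical id scheme, one id per chunk
            )
        )
    return converted


# Intended use (needs llama-index installed; not executed here):
#   from llama_index.core import Document
#   documents = convert_chunks(docs, Document)
#   index = KnowledgeGraphIndex.from_documents(documents, max_triplets_per_chunk=2)
```

For several CSV files, the same helper could be applied to the combined list of chunks from each loader before building the index.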

Upvotes: 0
