Reputation: 404
I am trying to run, through LlamaIndex, the same prompt (query) I ran on a simple PDF (a legal document in Brazilian Portuguese) with pure llama-cpp-python. Here is the pure llama-cpp-python version:
from llama_cpp import Llama
import os, re, sys
from pypdf import PdfReader

# Read the PDF and concatenate the text of every page into a single string.
reader = PdfReader('inicial_pg10_teste.pdf')  # https://files.pdfupload.io/documents/c2683560/inicial_pg10_teste.pdf
total_pages = len(reader.pages)
texto_inicial = ''
for page_num in range(total_pages):
    page = reader.pages[page_num]
    texto_inicial += page.extract_text()

model_name = 'mistral-br-pt-q4_k_m.gguf'  # https://huggingface.co/nicolasdec/CabraMistral7b-v2/blob/quantization/mistral-br-pt-q4_k_m.gguf
llm = Llama(
    model_path=f"llms/{model_name}",
    n_gpu_layers=20,
    n_ctx=7000,
)

# The whole extracted text is passed inside the prompt; no retrieval involved.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": f"""Quem são os réus desta ação? {texto_inicial}"""
        }
    ]
)
print(response['choices'][0]['message']['content'])
Result (correct, by the way):
Os réus neste processo trabalhista são:
Degustare e Servir Alimentação e Serviços Técnicos Ltda. (empresa de direito privado, inscrita no CNPJ nº: 17.104.821/0001-70, com sede na Avenida do Rio Branco, nº 869, Centro, Niterói, Rio de Janeiro, CEP: 24020-006)
Secretaria de Estado de Educação do Rio de Janeiro (SEERJ) (pessoa jurídica de direito público interno, inscrita no CNPJ sob o nº 42.498.600/0001-71, com sede na Rua Pinheiro Machado, s/n°, Palácio da Guanaba, Laranjeiras, Rio de Janeiro/RJ, CEP 22.231-901)
But using llama_index:
import os, re, sys
from pathlib import Path
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, download_loader, ServiceContext, PromptTemplate
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.readers.file import PDFReader

loader = PDFReader()
documents = loader.load_data(file=Path('inicial_pg10_teste.pdf'))  # https://files.pdfupload.io/documents/c2683560/inicial_pg10_teste.pdf

model_name = 'mistral-br-pt-q4_k_m.gguf'  # https://huggingface.co/nicolasdec/CabraMistral7b-v2/blob/quantization/mistral-br-pt-q4_k_m.gguf
llm = LlamaCPP(
    model_path=f"llms/{model_name}",
    model_kwargs={"n_gpu_layers": 20},
    context_window=7000,
)

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local",
    chunk_size=6235
)

index = VectorStoreIndex.from_documents(documents, service_context=service_context)

template = (
    "<s> [INST] {query_str} {context_str} [/INST]"
)
custom_prompt = PromptTemplate(template)

query_engine = index.as_query_engine(text_qa_template=custom_prompt)
question = "Quem são os réus desta ação?"
response = query_engine.query(question)
print(response)
The result is different (and wrong):
Os réus neste processo são a Reclamada e a 2ª Reclamada. (roughly: "The defendants in this case are the Respondent and the 2nd Respondent.")
So how can I get the same result from LlamaCPP when using it through llama-index?
Upvotes: 1
Views: 471
Reputation: 64
Try using a dedicated vector store such as Pinecone or Qdrant. Reduce the chunk size and introduce chunk overlap. You can also switch to an embedding model such as "thenlper/gte-large" or similar. Please let me know if these changes give you better results.
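As a rough illustration of those suggestions, here is a minimal sketch against the code from the question, assuming llama-index 0.10.x with the llama-index-embeddings-huggingface and llama-index-vector-stores-qdrant packages (plus qdrant-client) installed; the chunk size, overlap, top_k, and the in-memory Qdrant collection name are illustrative values, not tuned ones:

from pathlib import Path

import qdrant_client
from llama_index.core import ServiceContext, StorageContext, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.readers.file import PDFReader
from llama_index.vector_stores.qdrant import QdrantVectorStore

documents = PDFReader().load_data(file=Path('inicial_pg10_teste.pdf'))

llm = LlamaCPP(
    model_path="llms/mistral-br-pt-q4_k_m.gguf",
    model_kwargs={"n_gpu_layers": 20},
    context_window=7000,
)

# Dedicated embedding model instead of the "local" default.
embed_model = HuggingFaceEmbedding(model_name="thenlper/gte-large")

# Smaller chunks with some overlap, so each retrieved node is focused but
# does not cut relevant passages in half.
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=512,     # illustrative value
    chunk_overlap=64,   # illustrative value
)

# External vector store (Qdrant); an in-memory instance keeps the example self-contained.
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="inicial")  # hypothetical collection name
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(
    documents,
    service_context=service_context,
    storage_context=storage_context,
)

# Retrieve a few more chunks per query so the passage naming the defendants is more likely included.
query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("Quem são os réus desta ação?"))

With smaller, overlapping chunks the retriever has a better chance of returning the passage that actually names the defendants rather than a single oversized chunk, which may be why the query engine currently answers only with "a Reclamada e a 2ª Reclamada".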
Upvotes: 0