user2966197

Reputation: 2991

LlamaIndex: how to execute a search query against an existing OpenSearch (Elasticsearch) index?

I have this code, which lets me create an index in OpenSearch (Elasticsearch):

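# Imports assume the legacy (pre-0.10) llama_index package layout
from os import getenv
from pathlib import Path

from llama_index import GPTVectorStoreIndex, StorageContext, download_loader
from llama_index.vector_stores import OpensearchVectorClient, OpensearchVectorStore


def openes_initiate(file):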
    endpoint = getenv("OPENSEARCH_ENDPOINT", "http://localhost:9200")
    # index to demonstrate the VectorStore impl
    idx = getenv("OPENSEARCH_INDEX", "llama-osindex-demo")
    
    UnstructuredReader = download_loader("UnstructuredReader")

    loader = UnstructuredReader()
    documents = loader.load_data(file=Path(file))

    # OpensearchVectorClient stores text in this field by default
    text_field = "content"
    # OpensearchVectorClient stores embeddings in this field by default
    embedding_field = "embedding"
    # OpensearchVectorClient encapsulates logic for a
    # single opensearch index with vector search enabled
    client = OpensearchVectorClient(
        endpoint, idx, 1536, embedding_field=embedding_field, text_field=text_field
    )
    # initialize vector store
    vector_store = OpensearchVectorStore(client)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    # initialize an index using our sample data and the client we just created
    index = GPTVectorStoreIndex.from_documents(
        documents=documents, storage_context=storage_context
    )

The issue I am having is that once the data has been indexed, I am unable to reload it and run a query against it. I tried this:

def query(index,question):
    query_engine = index.as_query_engine()
    res = query_engine.query(question)
    print(res.response)

Here index is the one I created in the first piece of code, but the query returns None.

Upvotes: 2

Views: 1858

Answers (3)

Yash Sing

Reputation: 45

Assuming you already have OpenSearch (and its dashboard) up and running, this is how you would typically go about loading the index:

To initialize your index:

# using default values
endpoint = f"https://{user}:{password}@{hostname}"
idx = "sample-index"
text_field = "content"
embedding_field = "embedding"
client = OpensearchVectorClient(
    endpoint, idx, dim=1536, embedding_field=embedding_field, text_field=text_field
)

This creates an empty index in your OpenSearch database.

To store document embeddings in the index, use:

vector_store = OpensearchVectorStore(client)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# using a simple VectorStoreIndex
index = VectorStoreIndex.from_documents(
    documents=documents, storage_context=storage_context
)

This will populate your index.

Now, to load the contents from the populated index:

vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store
)

This builds an index object on top of the already populated vector store, which you can then use as a query engine, retriever, or chat engine.
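As a rough illustration of those three uses, here is a minimal sketch. The helper name ask_three_ways and the similarity_top_k=3 value are my own choices for the example, not something from the question, and vector_index is assumed to be the object returned by VectorStoreIndex.from_vector_store:

```python
def ask_three_ways(vector_index, question):
    """Run one question through a query engine, a retriever and a chat engine.

    `vector_index` is assumed to be a llama_index VectorStoreIndex (or any
    object exposing the same as_* methods). This helper is illustrative only.
    """
    # Query engine: retrieves matching nodes, then asks the LLM for an answer.
    answer = vector_index.as_query_engine().query(question)

    # Retriever: returns the matching nodes only, with no answer synthesis.
    nodes = vector_index.as_retriever(similarity_top_k=3).retrieve(question)

    # Chat engine: conversational interface on top of the same index.
    reply = vector_index.as_chat_engine().chat(question)

    return str(answer), nodes, str(reply)
```

The retriever call is handy for debugging: if it returns an empty list, the problem is on the OpenSearch side (index name, field names, embeddings) rather than in the LLM step.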

Upvotes: 0

ExistMe

Reputation: 529

I think when storing you should use something like this:

service_context = ServiceContext.from_defaults(
    llm=None,
    embed_model=your_embedding_model,
)
index = VectorStoreIndex.from_documents(
    documents=documents,
    storage_context=storage_context,
    service_context=service_context,
)

After your data has been embedded, to retrieve it, you'll need to get the vector store from the OpensearchVectorClient. Here's a snippet that can help you with that:

Given:

client = OpensearchVectorClient(endpoint, idx, 1536, 
                                embedding_field=embedding_field, 
                                text_field=text_field)
vector_store = OpensearchVectorStore(client)

Get VectorStoreIndex from the vector_store:

vsi = VectorStoreIndex.from_vector_store(
    vector_store, service_context=service_context
)
query_engine = vsi.as_query_engine()
res = query_engine.query("your question")
print(res)

This should assist you in retrieving and querying your embedded data.
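If the query still comes back empty, it may help to look at what was actually retrieved. The sketch below is only a debugging aid under my own assumptions: summarize_response is a hypothetical helper, and it relies on the response object exposing the response and source_nodes attributes, which it reads defensively with getattr:

```python
def summarize_response(res):
    """Return a short diagnostic string for a query response object.

    Hypothetical helper: assumes `res` has `response` and `source_nodes`
    attributes, as llama_index response objects do.
    """
    nodes = getattr(res, "source_nodes", None) or []
    if not nodes:
        # Nothing came back from the vector store at all.
        return (
            "no source nodes retrieved: check the index name, "
            "field names and embedding dimension"
        )
    text = getattr(res, "response", None)
    if text is None:
        # Retrieval worked, but no answer text was produced.
        return f"{len(nodes)} node(s) retrieved but no answer text was synthesized"
    return f"{len(nodes)} node(s) retrieved; answer: {text}"
```

You would call it as print(summarize_response(query_engine.query("your question"))).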

Upvotes: 0

Sacheen Shah

Reputation: 1

You need to create an OpenSearch client and load the index with VectorStoreIndex.from_vector_store() before you can run a query on it.

Otherwise the index object is null, which is why you get a null result.
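One possible way to end up with a null index, assuming the index passed to query() is whatever openes_initiate() gave back: that function never returns the index it builds, so the caller receives None. A library-free sketch of the effect, where the dict is just a stand-in for the real index object and both function names are made up for the example:

```python
def build_without_return(documents):
    # Stand-in for GPTVectorStoreIndex.from_documents(...)
    index = {"docs": list(documents)}
    # No return statement, so the caller gets None.


def build_with_return(documents):
    # Same stand-in index, but handed back to the caller.
    index = {"docs": list(documents)}
    return index


lost = build_without_return(["a"])
kept = build_with_return(["a"])
print(lost is None)  # True
print(kept)          # {'docs': ['a']}
```

If that is the case here, adding return index at the end of openes_initiate(), or rebuilding the index with from_vector_store() as described above, should give query() a usable object.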

Upvotes: 0
