I was following this Elasticsearch documentation page: https://www.elastic.co/guide/en/elasticsearch/reference/current/bring-your-own-vectors.html
I have already stored the vectors and tried to query the documents with another embedding vector, but every hit comes back with the same score.
Here's my mapping, inside a create_index function:
def create_index(es_client, index_name, dim):
    # Define index settings and mappings
    mapping = {
        "mappings": {
            "properties": {
                "page_id": {"type": "keyword"},
                "chunk_id": {"type": "keyword"},
                "embedding": {
                    "index": True,  # set to true to enable the use of the knn query
                    "type": "dense_vector",
                    "dims": dim,
                    "similarity": "cosine",
                },
            }
        }
    }
    if es_client.indices.exists(index=index_name):
        print(f"Index {index_name} already exists.")
    else:
        es_client.indices.create(index=index_name, body=mapping)
        print(f"Index {index_name} created successfully.")
And here's the query, wrapped in a function:
def search_similar_documents(user_embedding, top_k=3):
    """
    Perform a k-NN semantic search in Elasticsearch.

    :param user_embedding: List of floats representing the user's query embedding.
    :param top_k: Number of similar documents to retrieve.
    :return: The raw Elasticsearch search response.
    """
    query = {
        "knn": {
            "field": "embedding",
            "query_vector": user_embedding,
            "k": top_k,
            "num_candidates": 100,
        },
        "_source": ["page_id", "chunk_id"],
    }
    return es_client.search(index=ELASTICSEARCH_INDEX_NAME, body=query)

response = search_similar_documents(human_input_embedded, top_k=20)
All 20 hits have the same score!
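For reference, this is roughly how the scores can be pulled out of the response to confirm they are identical. It is a minimal sketch that assumes the response has the usual `hits.hits` structure with `_id` and `_score` on each hit; `summarize_scores` is a hypothetical helper name, not part of the Elasticsearch client.

```python
def summarize_scores(response):
    """Return (doc_id, score) pairs and a flag telling whether all scores match.

    Assumes `response` supports dict-style access and follows the standard
    search response layout: response["hits"]["hits"][i]["_score"].
    """
    hits = response["hits"]["hits"]
    pairs = [(hit["_id"], hit["_score"]) for hit in hits]
    distinct_scores = {score for _, score in pairs}
    # Zero or one distinct value means every hit carries the same score.
    return pairs, len(distinct_scores) <= 1
```

Calling `summarize_scores(response)` on the result above would show whether the 20 scores are exactly equal or only look equal after rounding.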
I don't have a text field in my mapping. Could that be the issue? The text itself is stored in Neo4j for another purpose.
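One check that does not involve the text field at all is to recompute the expected score locally for a few stored vectors. As far as I understand the dense_vector documentation, with "similarity": "cosine" the score of a knn hit is (1 + cosine(query, vector)) / 2, so the sketch below assumes that formula; `cosine_similarity` and `expected_knn_score` are hypothetical helper names. If the local values differ between documents but Elasticsearch returns one score for all of them, the stored vectors or the query vector are probably not what I think they are.

```python
import math


def cosine_similarity(a, b):
    """Plain-Python cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def expected_knn_score(query_vector, doc_vector):
    """Score assumed for cosine similarity: (1 + cosine) / 2, in [0, 1]."""
    return (1 + cosine_similarity(query_vector, doc_vector)) / 2
```

For example, `expected_knn_score(human_input_embedded, stored_vector)` could be compared against the `_score` of the matching hit, where `stored_vector` stands for an embedding read back from the index.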