felipeformenti

Elasticsearch kNN semantic search with pre-stored embeddings returning the same score for every hit

I was following this Elasticsearch documentation page: https://www.elastic.co/guide/en/elasticsearch/reference/current/bring-your-own-vectors.html

I have already stored the vectors and tried to query the documents with another embedding vector, but every hit comes back with the same score.

Here's my mapping, inside a create_index function:

def create_index(es_client, index_name, dim):
    # Define index settings and mappings
    mapping = {
        "mappings": {
            "properties": {
                "page_id": {"type": "keyword"},
                "chunk_id": {"type": "keyword"},
                "embedding": {
                    "index": True,  # set to true to enable the use of the knn query
                    "type": "dense_vector",
                    "dims": dim,
                    "similarity": "cosine",
                },
            }
        }
    }

    if es_client.indices.exists(index=index_name):
        print(f"Index {index_name} already exists.")
    else:
        es_client.indices.create(index=index_name, body=mapping)
        print(f"Index {index_name} created successfully.")
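
Because create_index skips creation when the index already exists, the live mapping may differ from the one defined above. A minimal sketch of a check, assuming a response shaped like the output of es_client.indices.get_mapping(index=index_name), i.e. {index_name: {"mappings": {"properties": {...}}}}. The helper names describe_vector_field and vector_field_matches are hypothetical and not part of the Elasticsearch client:

```python
def describe_vector_field(mapping_response, index_name, field="embedding"):
    """Return the stored mapping of one field from a get_mapping-style dict.

    Returns None when the field is not present in the index mapping.
    """
    properties = mapping_response[index_name]["mappings"].get("properties", {})
    return properties.get(field)


def vector_field_matches(field_mapping, dim, similarity="cosine"):
    """Compare the live field mapping with what create_index intended to set."""
    if field_mapping is None:
        return False
    return (
        field_mapping.get("type") == "dense_vector"
        and field_mapping.get("dims") == dim
        and field_mapping.get("similarity") == similarity
    )
```

If vector_field_matches returns False for the expected dim, the index was probably created earlier with a different mapping and would need to be recreated and reindexed.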

And here's the query, wrapped in a function:

def search_similar_documents(user_embedding, top_k=3):
    """
    Perform a k-NN semantic search in Elasticsearch.

    :param user_embedding: List of floats representing the user's query embedding.
    :param top_k: Number of similar documents to retrieve.
    :return: Raw Elasticsearch search response (hits carry page_id and chunk_id).
    """

    query = {
        "knn": {
            "field": "embedding",
            "query_vector": user_embedding,
            "k": top_k,
            "num_candidates": 100,
        },
        "_source": ["page_id", "chunk_id"],
    }

    return es_client.search(index=ELASTICSEARCH_INDEX_NAME, body=query)

response = search_similar_documents(human_input_embedded, top_k=20)

All 20 hits have the same score!
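
To narrow this down, the scores can be checked against a locally computed value. A minimal sketch, assuming the standard search response shape (hits.hits[]._score) and the documented scoring for "similarity": "cosine", where _score = (1 + cosine(query, vector)) / 2. The helper names summarize_scores and es_cosine_score are hypothetical:

```python
import math


def summarize_scores(response):
    """Collect the _score of every hit and count how many distinct values occur."""
    scores = [hit["_score"] for hit in response["hits"]["hits"]]
    return {"scores": scores, "distinct": len(set(scores))}


def es_cosine_score(a, b):
    """Reproduce locally the score expected for similarity='cosine'.

    Assumes both vectors have non-zero magnitude; a zero vector would
    cause a division by zero here.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return (1.0 + dot / (norm_a * norm_b)) / 2.0
```

If summarize_scores(response)["distinct"] is 1 while es_cosine_score gives different values for a few stored vectors fetched separately, the problem is likely in the query. If the local scores are also identical, the stored embeddings themselves are probably identical.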

I don't have the text field in my mapping. Could that be the issue? The text itself is stored in Neo4j for another purpose.
