Muralidhar

Reputation: 21

Elasticsearch query with cosine similarity fails with a runtime error

I created an index with a dense_vector field (dims: 512) and uploaded JSON data into Elasticsearch. I then ran a search query that scores documents by cosine similarity against the query vector [4, 3.4, -0.2].

I followed this article: https://medium.com/version-1/vector-based-semantic-search-using-elasticsearch-48d7167b38f5

Index creation:

PUT jsonsample2
{
    "settings": {"number_of_shards": 2,"number_of_replicas": 1},
    "mappings": 
    {
        "dynamic": "true","_source": {"enabled": "true"},
        "properties": 
        {
            "Document_name": {"type": "text"},
            "Doc_vector": {"type": "dense_vector","dims": 512}
        }
    }
}

Search query using cosine similarity against the query vector:

GET jsonsample2/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'Doc_vector')",
        "params": {
          "query_vector": [4, 3.4, -0.2]
        }
      }
    }
  }
}

The cosine similarity search fails with the following runtime error:

    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 1,
        "failures" : [
          {
            "shard" : 0,
            "index" : "jsonsample2",
            "node" : "e9WAytp6Si65x_YDRuLQcg",
            "reason" : {
              "type" : "script_exception",
              "reason" : "runtime error",
              "script_stack" : [
                "org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$DenseVectorFunction.getEncodedVector(ScoreScriptUtils.java:100)",
                "org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$CosineSimilarity.cosineSimilarity(ScoreScriptUtils.java:179)",
                "cosineSimilarity(params.query_vector, 'Doc_vector')",
                "                                      ^---- HERE"
              ],
              "script" : "cosineSimilarity(params.query_vector, 'Doc_vector')",
              "lang" : "painless",
              "position" : {
                "offset" : 38,
                "start" : 0,
                "end" : 51
              },
              "caused_by" : {
                "type" : "illegal_argument_exception",
                "reason" : "A document doesn't have a value for a vector field!"
              }
            }
          }
        ]
      },
      "hits" : {
        "total" : {
          "value" : 0,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      }
    }

Upvotes: 1

Views: 3878

Answers (1)

hamid bayat

Reputation: 2179

The problem is stated in the error message:

"reason" : "A document doesn't have a value for a vector field!"

To check whether a document is missing a value for a field, you can use doc['field'].size() == 0.

Your script should look like this:

"source": "doc['Doc_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, 'Doc_vector')"
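If the request body is built from application code, the guarded script can be assembled as in the rough Python sketch below. The function name build_cosine_query and its parameters are hypothetical, not part of any Elasticsearch client. The length check is an extra assumption on my part: as far as I know, Elasticsearch also rejects a query vector whose length differs from the dims in the mapping (512 here), so a 3-element vector like the one in the question would likely fail even with the guard in place.

```python
import json


def build_cosine_query(query_vector, field="Doc_vector", dims=512):
    """Build a script_score request body that guards against missing vectors.

    Hypothetical helper: it only builds a dict, it does not talk to Elasticsearch.
    """
    # Assumption: the query vector must have the same length as the mapped dims.
    if len(query_vector) != dims:
        raise ValueError(
            f"query vector has {len(query_vector)} dims, expected {dims}"
        )

    # Same script as above: score 0 when the document has no vector value.
    source = (
        f"doc['{field}'].size() == 0 ? 0 : "
        f"cosineSimilarity(params.query_vector, '{field}')"
    )

    return {
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": source,
                    "params": {"query_vector": list(query_vector)},
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholder vector with the mapped number of dimensions.
    body = build_cosine_query([0.1] * 512)
    print(json.dumps(body["query"]["script_score"]["script"]["source"]))
```

The resulting dict can be sent as the body of GET jsonsample2/_search with whatever HTTP or Elasticsearch client you already use.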

Upvotes: 0
