Reputation: 21
I created an index with a dense_vector field (dims: 512), uploaded JSON data into Elasticsearch, and ran a search query that scores documents by cosine similarity against the query vector [4, 3.4, -0.2].
I followed this article: https://medium.com/version-1/vector-based-semantic-search-using-elasticsearch-48d7167b38f5
Index creation:
PUT jsonsample2
{
  "settings": {"number_of_shards": 2, "number_of_replicas": 1},
  "mappings": {
    "dynamic": "true",
    "_source": {"enabled": "true"},
    "properties": {
      "Document_name": {"type": "text"},
      "Doc_vector": {"type": "dense_vector", "dims": 512}
    }
  }
}
Search query using cosine similarity against the query vector:
GET jsonsample2/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'Doc_vector')",
        "params": {
          "query_vector": [4, 3.4, -0.2]
        }
      }
    }
  }
}
The cosine similarity search fails with a runtime error:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 1,
"skipped" : 0,
"failed" : 1,
"failures" : [
{
"shard" : 0,
"index" : "jsonsample2",
"node" : "e9WAytp6Si65x_YDRuLQcg",
"reason" : {
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [ "org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$DenseVectorFunction.getEncodedVector(ScoreScriptUtils.java:100)",
"org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$CosineSimilarity.cosineSimilarity(ScoreScriptUtils.java:179)",
"cosineSimilarity(params.query_vector, 'Doc_vector')",
" ^---- HERE"
],
"script" : "cosineSimilarity(params.query_vector, 'Doc_vector')",
"lang" : "painless",
"position" : {
"offset" : 38,
"start" : 0,
"end" : 51
},
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "A document doesn't have a value for a vector field!"
}
}
}
]
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
Upvotes: 1
Views: 3878
Reputation: 2179
The problem is stated in the error message:
"reason" : "A document doesn't have a value for a vector field!"
To check whether a document is missing a value for a field, you can use doc['field'].size() == 0.
Your script should look like this:
"source": "doc['Doc_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, 'Doc_vector')"
Upvotes: 0
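For context, here is a minimal sketch of how the guarded script from the answer might sit inside the full request from the question. The index name, field name, and query vector are copied from the question; nothing else is assumed. Note that the three-value query_vector is only the question's placeholder: as far as I know, cosineSimilarity rejects a query vector whose length differs from the field's dims, so a real request against this mapping would need 512 values.

```
# Sketch only: the question's request with the answer's missing-value guard.
# query_vector is the question's 3-value placeholder; the mapping expects 512 values.
GET jsonsample2/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "doc['Doc_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, 'Doc_vector')",
        "params": {
          "query_vector": [4, 3.4, -0.2]
        }
      }
    }
  }
}
```

Documents without a Doc_vector value get a score of 0 instead of raising the script_exception. Since script_score does not accept negative scores and cosine similarity can be negative, the Elasticsearch examples usually add 1.0 to the result; doing the same here would mean writing cosineSimilarity(params.query_vector, 'Doc_vector') + 1.0 in the second branch.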