Reputation: 29
I want to use the tf-idf score from an 'analyzed' field and a 'term' query on a 'not_analyzed' field to narrow down the results, but the scores are not what I expect.
According to the official documentation, a 'not_analyzed' field is not analyzed, which I took to mean that es does not calculate a score on such fields. I wanted to take advantage of this to select the documents I want, because I need the tf-idf score on a specific field for further calculation, but the scores change when I add the term condition. I have tried 3 steps:
1. Run 'match' on the analyzed field only; this score is the one I want.
2. Combine 'match' with a 'term' on the not_analyzed field; the returned scores are a bit higher than those from step 1.
3. Run only 'term' on the not_analyzed field; es returns a score here as well.
Part of the code is shown below. These are the 4 data entries, followed by the mappings and the first query:
data = {"did": 1, "title": "hu la la", "test": ["a", "b", "c"]}
data = {"did": 2, "title": "hu la", "test": ["a", "b", "c"]}
data = {"did": 3, "title": "hu la la", "test": ["a", "b"]}
data = {"did": 4, "title": "la la", "test": ["a", "b", "c"]}
mappings = {
"properties": {
"did": {"type": "long", "index": "not_analyzed"},
"title": {"type": "string", "index": "analyzed"},
"test": {"type": "string", "index": "not_analyzed"},
}
}
curl -X GET 'http://localhost:9200/test7/_search?pretty=true' -d '
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "la"
}
}
]
}
}
}
'
One of the hits is:
{
"_index" : "test7",
"_type" : "default",
"_id" : "AWoRGrIx5vn17yswf0rR",
"_score" : 0.4203996,
"_source" : {
"did" : 1,
"test" : [ "a", "b", "c" ],
"title" : "hu la la"
}
}
But when I add a term clause:
curl -X GET 'http://localhost:9200/test7/_search?pretty=true' -d '
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "la"
}
},
{
"term": {
"test": "a"
}
}
]
}
}
}
'
The score of the same hit changed!
{
"_index" : "test7",
"_type" : "default",
"_id" : "AWoRGrIx5vn17yswf0rR",
"_score" : 0.7176671,
"_source" : {
"did" : 1,
"test" : [ "a", "b", "c" ],
"title" : "hu la la"
}
}
Upvotes: 0
Views: 157
Reputation: 515
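Use a filter clause to restrict the results; clauses in filter context do not affect the score. A 'not_analyzed' field is still indexed and scored, just as a single untokenized term, so a 'term' query placed in 'must' contributes to the total score.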
Example:
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "la"
}
}
],
"filter": [
{
"term": {
"test": "a"
}
}
]
}
}
}
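If the query body is assembled in Python, as the data and mappings in the question suggest, the same structure could be built with a small helper. This is only a sketch: the function name `build_filtered_query` and its parameters are made up for illustration, and it assumes an Elasticsearch version whose `bool` query accepts a `filter` clause (2.0 or later).

```python
import json


def build_filtered_query(match_field, match_text, term_filters):
    """Build a bool query in which only the match clause is scored.

    Hypothetical helper, not part of any Elasticsearch client.
    term_filters maps a not_analyzed field name to the exact value required.
    """
    return {
        "query": {
            "bool": {
                # Scored clause: tf-idf relevance on the analyzed field.
                "must": [{"match": {match_field: match_text}}],
                # Filter context: restricts the hits without affecting _score.
                "filter": [
                    {"term": {field: value}}
                    for field, value in term_filters.items()
                ],
            }
        }
    }


body = build_filtered_query("title", "la", {"test": "a"})
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the curl command from the question.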
Upvotes: 1