Traeyee

Reputation: 29

Why does a 'term' query on a not_analyzed field still return a score in es?

I just want to use the tf-idf score on an 'analyzed' field and use 'term' on a 'not_analyzed' field to filter for the results I prefer. But the results are not what I expect.

According to the official documentation, a 'not_analyzed' field is not analyzed, which I took to mean that es does not calculate a score on such fields. I wanted to take advantage of this to filter the results, because I need the tf-idf score on a specific field for further calculation, but the scores change when I add a term condition. I have tried 3 steps:

1. 'match' on the analyzed field only; this score is the one I want.
2. 'match' combined with 'term' on the not_analyzed field; the returned score is a bit higher than in step 1.
3. 'term' on the 'not_analyzed' field only; es still returns a score.

Part of the code is shown below. These are the 4 documents in the index:

data = {"did": 1, "title": "hu la la", "test": ["a", "b", "c"]}

data = {"did": 2, "title": "hu la", "test": ["a", "b", "c"]}

data = {"did": 3, "title": "hu la la", "test": ["a", "b"]}

data = {"did": 4, "title": "la la", "test": ["a", "b", "c"]}

mappings = {
    "properties": {
        "did": {"type": "long", "index": "not_analyzed"},
        "title": {"type": "string", "index": "analyzed"},
        "test": {"type": "string", "index": "not_analyzed"},
    }
}

curl -X GET 'http://localhost:9200/test7/_search?pretty=true' -d '
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "la"
                    }
                }
            ]
        }
    }
}
'

One of the hits is:

{
      "_index" : "test7",
      "_type" : "default",
      "_id" : "AWoRGrIx5vn17yswf0rR",
      "_score" : 0.4203996,
      "_source" : {
        "did" : 1,
        "test" : [ "a", "b", "c" ],
        "title" : "hu la la"
      }

But when I add the term clause:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "la"
                    }
                },
                {
                    "term": {
                        "test": "a"
                    }
                }
            ]
        }
    }
}
'

the score of the same document changed!

{
      "_index" : "test7",
      "_type" : "default",
      "_id" : "AWoRGrIx5vn17yswf0rR",
      "_score" : 0.7176671,
      "_source" : {
        "did" : 1,
        "test" : [ "a", "b", "c" ],
        "title" : "hu la la"
      }

Upvotes: 0

Views: 157

Answers (1)

aHochstein

Reputation: 515

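A 'term' query on a not_analyzed field is still scored: not_analyzed only means the value is indexed as a single token without analysis, not that the field is excluded from scoring. To restrict the results without changing the score, put the term clause in the filter context of the bool query. Filter clauses decide which documents match but do not contribute to the score.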

Example:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "la"
                    }
                }
            ],
            "filter": [
                {
                    "term": {
                        "test": "a"
                    }
                }
            ]
        }
    }
}
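
If you build the request body in Python, as the data and mapping snippets in the question suggest, the same query could be assembled like this. It is only a minimal sketch: `build_query` is a hypothetical helper name, not part of any Elasticsearch client, and it assumes a version whose bool query accepts a `filter` clause.

```python
import json


def build_query(match_field, match_text, filter_field, filter_value):
    """Build a bool query body (hypothetical helper, for illustration only).

    The match clause sits under "must", so it is scored.
    The term clause sits under "filter", so it only restricts the hits.
    """
    return {
        "query": {
            "bool": {
                "must": [{"match": {match_field: match_text}}],
                "filter": [{"term": {filter_field: filter_value}}],
            }
        }
    }


# Field names and values taken from the question.
body = build_query("title", "la", "test", "a")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the request body to test7/_search, the same way as the queries in the question.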

Upvotes: 1
