Josh Harrison
Josh Harrison

Reputation: 434

Elastic search exact match

I'm using elasticsearch and am having a devil of a time getting an exact match to happen. I've tried various combinations of match, query_string, etc, and I either get nothing or bad results. Query looks like this:

{
  "filter": {
    "term": {
      "term": "dog",
      "type": "main"
    }
  },
  "query": {
    "match_phrase": {
      "term": "Dog"
    }
  },
  "sort": [
    "_score"
  ]
}

Sorted results

10.102211 {u'term': u'The Dog', u'type': u'main', u'conceptid': 7730506}
10.102211 {u'term': u'That Dog', u'type': u'main', u'conceptid': 4345664}
10.102211 {u'term': u'Dog', u'type': u'main', u'conceptid': 144}
7.147442 {u'term': u'Dog Eat Dog (song)', u'type': u'main', u'conceptid': u'5288184'}

I see, of course that "The Dog", "That Dog" and "Dog" all have the same score, but I need to figure out how I can boost the exact match "Dog" in score.

I also tried

{
  "sort": [
    "_score"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "term": "Dog"
          }
        },
        {
          "match_phrase": {
            "term": {
              "query": "Dog",
              "boost": 5
            }
          }
        }
      ]
    }
  },
  "filter": {
    "term": {
      "term": "dog",
      "type": "main"
    }
  }
}

but that still just gives me

11.887239 {u'term': u'The Dog', u'type': u'main', u'conceptid': 7730506}
11.887239 {u'term': u'That Dog', u'type': u'main', u'conceptid': 4345664}
11.887239 {u'term': u'Dog', u'type': u'main', u'conceptid': 144}
8.410372 {u'term': u'Dog Eat Dog (song)', u'type': u'main', u'conceptid': u'5288184'}

Upvotes: 14

Views: 19644

Answers (3)

Danny Yang
Danny Yang

Reputation: 1

Hash two value which you need to search into hash key, then search it.

Upvotes: -1

runarM
runarM

Reputation: 1621

Fields are analyzed with the standard analyzer by default. If you would like to check exact match, you could store your field not analyzed also e.g:

"dog":{
            "type":"multi_field",
            "fields":{
                "dog":{
                    "include_in_all":false,
                    "type":"string",
                    "index":"not_analyzed",
                    "store":"no"
                },
                "_tokenized":{
                    "include_in_all":false,
                    "type":"string",
                    "index":"analyzed",
                    "store":"no"
                }
            }
        }

Then you can query the dog-field for exact matches, and dog._tokenized for analyzed queries (like fulltext)

Upvotes: 14

moliware
moliware

Reputation: 10288

I think that your problem is that field term is being analyzed (check your mapping) with the standard analyzer and is filtering stopwords such as the or that. For that reason you get the same score for Dog and The Dog. So maybe you can solve your problem by configuring a custom analyzer => documentation page

Upvotes: 0

Related Questions