gringogordo

Reputation: 2130

Exact match in Elasticsearch after incorporating the hunspell filter

We have added the hunspell filter to our Elasticsearch instance. Nothing fancy:

{
    "index" : {
        "analysis" : {
            "tokenizer" : {
                "comma" : {
                    "type" : "pattern",
                    "pattern" : ","
                }
            },
            "filter" : {
                "en_GB" : {
                    "type" : "hunspell",
                    "language" : "en_GB"
                }
            },
            "analyzer" : {
                "comma" : {
                    "type" : "custom",
                    "tokenizer" : "comma"
                },
                "en_GB" : {
                    "filter" : [
                        "lowercase",
                        "en_GB"
                    ],
                    "tokenizer" : "standard"
                }
            }
        }
    }
}

Now, though, we seem to have lost the built-in ability to do exact-match queries using quotation marks. For example, searching for "lace" also matches "lacy" with an equal score. I understand this is kind of the point of including hunspell, but I would like to be able to force exact matches by using quotes.

I am using bool queries for this, by the way, along these lines (built in Java):

"bool" : {
    "must" : {
      "query_string" : {
        "query" : "\"lace\"",
        "fields" : 
        ...

or (via Postman, directly against port 9200):

{
"query" : { 
  "query_string" : {
    "query" : "\"lace\"",
    "fields" :
....

Is this possible? I'm guessing this might be something we would do in the tokenizer, but I'm not quite sure where to start.

Upvotes: 0

Views: 202

Answers (1)

user3775217

Reputation: 4803

You will not be able to handle this at the tokenizer level, but you can handle it at the mapping level with multi-fields: keep a copy of the same field that is not analyzed, and query that copy to support your use case.

You can update your mappings as follows:

"mappings": {
    "desc": {
        "properties": {
            "labels": {
                "type": "text",
                "analyzer": "en_GB",
                "fields": {
                    "raw": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
}

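Then modify your query to search on the raw field instead of the analyzed field. Note that a keyword field is not analyzed at all, so it matches only when the whole field value is exactly the search term, including case.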

{
    "query": {
        "bool": {
            "must": [{
                "query_string": {
                    "default_field": "labels.raw",
                    "query": "lace"
                }
            }]
        }
    }
}
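
If you would rather keep the quote syntax from the question and match single words inside longer text, one possible variation is a second sub-field that is analyzed without hunspell, combined with the `quote_field_suffix` option of `query_string`. This is only a sketch under assumptions: the sub-field name `exact` is made up for illustration, the built-in `standard` analyzer stands in for "the `en_GB` analyzer minus the hunspell filter", and your Elasticsearch version needs to support `quote_field_suffix`. The mapping might look like this:

```json
{
    "mappings": {
        "desc": {
            "properties": {
                "labels": {
                    "type": "text",
                    "analyzer": "en_GB",
                    "fields": {
                        "exact": {
                            "type": "text",
                            "analyzer": "standard"
                        }
                    }
                }
            }
        }
    }
}
```

With that in place, quoted terms in the query string should be searched against `labels.exact`, while unquoted terms still go through hunspell on `labels`:

```json
{
    "query": {
        "query_string": {
            "fields": ["labels"],
            "quote_field_suffix": ".exact",
            "query": "\"lace\""
        }
    }
}
```

Existing documents would need to be reindexed before the new sub-field contains anything.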

Hope this helps. Thanks.

Upvotes: 1
