Reputation: 1338
I have a document with a field called 'countryCode'. I have a term query that searches on its keyword value, but I'm having some issues with it:
Can I instruct my index to handle all those variations somehow, instead of me having to expand the terms in my query filter?
Upvotes: 0
Views: 764
Reputation: 8840
What you are looking for is a way for your index to treat different tokens that mean the same thing as equivalent. This is possible using synonyms. Elasticsearch lets you configure synonyms and have your queries use them, returning results accordingly.
Below I have configured a field with a custom analyzer that uses a synonym token filter. I have created a sample mapping and query so that you can play with it and see if it fits your needs.
PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa, us",
            "uk, gb"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "mydocs": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "my_synonyms"
        }
      }
    }
  }
}
POST my_index/mydocs/1
{
  "name": "uk is pretty cool country"
}
When you run the query below, it returns the document above, even though the query searches for gb and the document contains uk.
GET my_index/mydocs/_search
{
  "query": {
    "match": {
      "name": "gb"
    }
  }
}
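To see the expansion for yourself, you can run the analyzer directly with the _analyze API against the index created above; with the settings shown, the synonym filter emits both uk and gb at the same position at index time:

```
GET my_index/_analyze
{
  "analyzer": "my_synonyms",
  "text": "uk is pretty cool country"
}
```

The token list in the response should include both uk and gb, which is why the match query for gb finds the document.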
Refer to the official documentation to understand more about synonyms. Hope this helps!
Upvotes: 1
Reputation: 3018
To handle this within Elasticsearch itself, without using Logstash, I'd suggest a simple ingest pipeline with a gsub processor to update the field in its place:
{
  "gsub": {
    "field": "countryCode",
    "pattern": "GB",
    "replacement": "UK"
  }
}
https://www.elastic.co/guide/en/elasticsearch/reference/master/gsub-processor.html
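A minimal end-to-end sketch (the pipeline name normalize_country is my own choice, not from the docs): register the pipeline, then reference it when indexing so the replacement happens before the document is stored:

```
PUT _ingest/pipeline/normalize_country
{
  "description": "Rewrite GB to UK in countryCode (hypothetical example)",
  "processors": [
    {
      "gsub": {
        "field": "countryCode",
        "pattern": "GB",
        "replacement": "UK"
      }
    }
  ]
}

PUT my_index/_doc/1?pipeline=normalize_country
{
  "countryCode": "GB"
}
```

Note that pattern is a regular expression and gsub replaces every match, so GB would also match inside a longer value; anchoring it (e.g. ^GB$) is safer if the field can contain more than a bare code.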
Upvotes: 0