javaNoober

Reputation: 1338

Elasticsearch - Do searches for alternative country codes

I have a document with a field called 'countryCode'. I have a term query that searches on its keyword value, but I am having some issues with:

Can I instruct my index to handle all those variations somehow, instead of me having to expand the terms on my query filter?

Upvotes: 0

Views: 764

Answers (2)

Kamal Kunjapur

Reputation: 8840

What you are looking for is a way to make a token match other tokens that mean the same thing, even when they share few or no characters. Synonyms are the standard way to do this.

Elasticsearch lets you configure synonyms so that your queries use them and return matching results accordingly.

I have configured a field with a custom analyzer that uses a synonym token filter. Below are a sample mapping and query so that you can play with it and see if it fits your needs.

Mapping

PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa, us",
            "uk, gb"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "mydocs": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "my_synonyms"
        }
      }
    }
  }
}
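
To check what the analyzer produces before indexing anything, the _analyze API can be run against the index. This is only a quick sanity check, assuming my_index was created with the mapping above; with the synonym rules as written, the token list returned for gb should include uk as well.

GET my_index/_analyze
{
  "analyzer": "my_synonyms",
  "text": "gb"
}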

Sample Document

POST my_index/mydocs/1
{
  "name": "uk is pretty cool country"
}

The query below searches for gb and still returns the above document, even though the document only contains uk.

Query

GET my_index/mydocs/_search
{
  "query": {
    "match": {
      "name": "gb"
    }
  }
}
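
For the countryCode field from the question, the same idea could look roughly like the sketch below. It is untested and makes a few assumptions: the index name country_index and the filter and analyzer names are made up for illustration, the field is mapped as text rather than keyword (a synonym filter needs an analyzer, so it cannot be attached to a keyword field directly), and the keyword tokenizer is used so that the whole value is treated as a single token.

PUT country_index
{
  "settings": {
    "analysis": {
      "filter": {
        "country_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa, us",
            "uk, gb"
          ]
        }
      },
      "analyzer": {
        "country_code_synonyms": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "country_synonym_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "mydocs": {
      "properties": {
        "countryCode": {
          "type": "text",
          "analyzer": "country_code_synonyms"
        }
      }
    }
  }
}

A term query is not analyzed, so with this mapping a match query is the safer choice:

GET country_index/mydocs/_search
{
  "query": {
    "match": {
      "countryCode": "GB"
    }
  }
}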

Refer to the official Elasticsearch documentation on the synonym token filter to understand more on this. Hope this helps!

Upvotes: 1

ben5556

Reputation: 3018

To handle this within ES itself, without using Logstash, I'd suggest a simple ingest pipeline with a gsub processor that updates the field in place:

{
  "gsub": {
    "field": "countryCode",
    "pattern": "GB",
    "replacement": "UK"
  }
}

https://www.elastic.co/guide/en/elasticsearch/reference/master/gsub-processor.html
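
To try the processor before wiring it into an index, it can be wrapped in a pipeline and run through the simulate API. This is a minimal sketch: the anchored pattern ^GB$ is my own assumption, so that only an exact GB value is rewritten rather than any value that merely contains those letters, and the test document is made up.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Normalize alternative country codes (illustrative)",
    "processors": [
      {
        "gsub": {
          "field": "countryCode",
          "pattern": "^GB$",
          "replacement": "UK"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "countryCode": "GB"
      }
    }
  ]
}

If the simulated result looks right, the same processors array can be stored as a named pipeline and applied at index time. Keep in mind that this rewrites the stored value, so queries must also use the normalized code.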

Upvotes: 0
