Reputation: 1338
I have a document with a field called 'countryCode'. I have a term query that searches on its keyword value, but I'm having some issues with it:
Can I instruct my index to handle all those variations somehow, instead of me having to expand the terms in my query filter?
Upvotes: 0
Views: 764
Reputation: 8840
What you are looking for is a way for your index to treat different tokens that mean the same thing as equivalent. This is possible using synonyms. Elasticsearch lets you configure synonyms and have your queries use them, returning results accordingly.
Below I have configured a field with a custom analyzer that uses a synonym token filter. I have created a sample mapping and query so that you can play with it and see if it fits your needs.
PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa, us",
            "uk, gb"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "mydocs": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "my_synonyms"
        }
      }
    }
  }
}
POST my_index/mydocs/1
{
  "name": "uk is pretty cool country"
}
When you run the query below, it returns the document above, even though the query searches for gb and the document contains uk.
GET my_index/mydocs/_search
{
  "query": {
    "match": {
      "name": "gb"
    }
  }
}
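To see the expansion for yourself, you can run the analyzer directly with the _analyze API against the index created above; with the settings shown, the synonym filter emits both uk and gb at the same position at index time:

```
GET my_index/_analyze
{
  "analyzer": "my_synonyms",
  "text": "uk is pretty cool country"
}
```

The token list in the response should include both uk and gb, which is why the match query for gb finds the document.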
Refer to the official documentation to understand more about synonyms. Hope this helps!
Upvotes: 1
Reputation: 3018
To handle this within Elasticsearch itself, without using Logstash, I'd suggest a simple ingest pipeline with a gsub processor to update the field in its place:
{
  "gsub": {
    "field": "countryCode",
    "pattern": "GB",
    "replacement": "UK"
  }
}
https://www.elastic.co/guide/en/elasticsearch/reference/master/gsub-processor.html
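A minimal end-to-end sketch (the pipeline name normalize_country is my own choice, not from the docs): register the pipeline, then reference it when indexing so the replacement happens before the document is stored:

```
PUT _ingest/pipeline/normalize_country
{
  "description": "Rewrite GB to UK in countryCode (hypothetical example)",
  "processors": [
    {
      "gsub": {
        "field": "countryCode",
        "pattern": "GB",
        "replacement": "UK"
      }
    }
  ]
}

PUT my_index/_doc/1?pipeline=normalize_country
{
  "countryCode": "GB"
}
```

Note that pattern is a regular expression and gsub replaces every match, so GB would also match inside a longer value; anchoring it (e.g. ^GB$) is safer if the field can contain more than a bare code.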
Upvotes: 0