Reputation: 595
Is there a way to require a complete (though not necessarily exact) match in ElasticSearch?
For instance, if a field has the term "I am a little teapot short and stout", I would like to match on " i am a LITTLE TeaPot short and stout! " but not just "teapot short and stout". I've tried the term filter, but that requires an actual exact match.
Upvotes: 1
Views: 191
Reputation: 52368
If your "not necessarily exact" definition refers to uppercase/lowercase differences and punctuation marks (like the ! in your example), this would be a solution, though not a simple or obvious one:
The mapping:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_keyword_lowercase": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "trim",
            "my_pattern_replace"
          ]
        }
      },
      "filter": {
        "my_pattern_replace": {
          "type": "pattern_replace",
          "pattern": "!",
          "replacement": ""
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "text": {
          "type": "string",
          "analyzer": "my_keyword_lowercase"
        }
      }
    }
  }
}
The idea here is the following:
- keyword tokenizer: keeps the text as is, without splitting it into tokens
- lowercase filter: gets rid of the uppercase/lowercase differences
- trim filter: gets rid of the leading and trailing whitespace
- pattern_replace filter: gets rid of the punctuation (here, the ! character). It is needed because the keyword tokenizer won't do anything to the characters inside the text. The standard analyzer would do this, but it would also split the text into tokens, whereas you need it kept as is.
And this is the query you would use for the mapping above:
{
  "query": {
    "match": {
      "text": " i am a LITTLE TeaPot short and stout! "
    }
  }
}
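To see why this matches the whole phrase but not a fragment, here is a rough Python sketch of what the analyzer chain does to both the indexed value and the query text. It is only an approximation for illustration, not Elasticsearch code: the function names analyze and matches are made up here, and Python's lower() and strip() are assumed to behave closely enough to the lowercase and trim filters for this example.

```python
import re


def analyze(text: str) -> str:
    """Approximate the my_keyword_lowercase analyzer (hypothetical helper)."""
    token = text                    # keyword tokenizer: the whole input is one token
    token = token.lower()           # lowercase filter
    token = token.strip()           # trim filter: drop leading/trailing whitespace
    token = re.sub("!", "", token)  # pattern_replace: pattern "!", replacement ""
    return token


# The single token stored in the index for the example field value.
indexed = analyze("I am a little teapot short and stout")


def matches(query: str) -> bool:
    """A match query analyzes the query text the same way, then compares tokens."""
    return analyze(query) == indexed


print(matches(" i am a LITTLE TeaPot short and stout! "))  # True
print(matches("teapot short and stout"))                   # False
```

Because the keyword tokenizer produces exactly one token per value, the query only matches when the entire normalized string is identical, which is the "complete but not exact" behaviour asked for.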
Upvotes: 1