Reputation: 141
I have a field indexed with a custom analyzer, configured as follows:
"COMPNAYNAME" : {
"type" : "text",
"analyzer" : "textAnalyzer"
}
"textAnalyzer" : {
"filter" : [
"lowercase"
],
"char_filter" : [ ],
"type" : "custom",
"tokenizer" : "ngram_tokenizer"
}
"tokenizer" : {
"ngram_tokenizer" : {
"type" : "ngram",
"min_gram" : "2",
"max_gram" : "3"
}
}
When I search for the text "ikea", I get the results below.
Query :
GET company_info_test_1/_search
{
"query": {
"match": {
"COMPNAYNAME": {"query": "ikea"}
}
}
}
The results are as follows:
1. mikea
2. likeable
3. maaikeart
4. likeables
5. ikea b.v. <------
6. likeachef
7. ikea breda <------
8. bernikeart
9. ikea duiven
10. mikea media
I expect exact matches to be boosted above the rest of the results. Could you please help me find the best way to index the field if I need to search by exact match as well as with fuzziness?
Thanks in advance.
Upvotes: 3
Views: 95
Reputation: 16172
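You can keep the ngram tokenizer at index time and add
"search_analyzer": "standard"
to the field mapping, so that the query text itself is not split into n-grams at search time. Refer to the Elasticsearch documentation on search_analyzer to know more.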
As pointed out by @EvaldasBuinauskas, you can also use the edge_ngram tokenizer here if you want tokens to be generated only from the beginning of each word and not from the middle.
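For the edge_ngram variant, the settings could look roughly like the sketch below. This is an untested illustration, not part of the working example: the names `my_edge_analyzer` and `my_edge_tokenizer` are made up, the gram sizes are simply copied from the ngram example, and the `lowercase` filter is my own addition so that index-time tokens line up with the lowercased terms the standard search analyzer produces.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_edge_analyzer": {
          "type": "custom",
          "tokenizer": "my_edge_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "my_edge_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}
```

With this tokenizer, "mikea" is indexed as mi, mik, mike, mikea, so a search for "ikea" should no longer match it, while "ikea b.v." still produces the token ikea.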
Here is a working example with index data, mapping, search query, and search result:
Index Data:
{ "title": "ikea b.v."}
{ "title" : "mikea" }
{ "title" : "maaikeart"}
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 2,
"max_gram": 10,
"token_chars": [
"letter",
"digit"
]
}
}
},
"max_ngram_diff": 50
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "my_analyzer",
"search_analyzer": "standard"
}
}
}
}
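To check which tokens this mapping produces at index time, a body like the one below can be sent to the `_analyze` endpoint of the index (for example `POST normal/_analyze`, where `normal` is the index name that appears in the search result further down). The sample text "mikea" is only an illustration.

```json
{
  "analyzer": "my_analyzer",
  "text": "mikea"
}
```

The response lists every n-gram of length 2 to 10, and for "mikea" that list includes ikea. This is why the document still matches a search for "ikea" even though the query term is no longer split.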
Search Query:
{
"query": {
"match" : {
"title" : "ikea"
}
}
}
Search Result:
"hits": [
{
"_index": "normal",
"_type": "_doc",
"_id": "4",
"_score": 0.1499838, <-- note this
"_source": {
"title": "ikea b.v."
}
},
{
"_index": "normal",
"_type": "_doc",
"_id": "1",
"_score": 0.13562363, <-- note this
"_source": {
"title": "mikea"
}
},
{
"_index": "normal",
"_type": "_doc",
"_id": "3",
"_score": 0.083597526,
"_source": {
"title": "maaikeart"
}
}
]
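If the exact match needs a bigger lead than the score difference above, one option is to index the same text a second time with the standard analyzer and boost matches on that sub-field. The following is a sketch under assumptions, not something I ran as part of the example: the sub-field name `exact` and the boost value 2 are arbitrary choices. The "mappings" section of the index mapping would become:

```json
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "standard",
        "fields": {
          "exact": {
            "type": "text",
            "analyzer": "standard"
          }
        }
      }
    }
  }
}
```

and the search query would combine both fields:

```json
{
  "query": {
    "bool": {
      "must": {
        "match": { "title": "ikea" }
      },
      "should": {
        "match": {
          "title.exact": { "query": "ikea", "boost": 2 }
        }
      }
    }
  }
}
```

The must clause keeps the partial matches such as "mikea" in the result, while the should clause only adds score to documents that contain "ikea" as a whole word. For typo tolerance, the `fuzziness` parameter of the match query (for example "AUTO") could be added to the `title.exact` clause.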
Upvotes: 2