Moulali Shaik

Reputation: 141

Elasticsearch results are not as expected

I have a field indexed with a custom analyzer, configured as follows:

"COMPNAYNAME" : {
  "type" : "text",
  "analyzer" : "textAnalyzer"
}

"textAnalyzer" : {
  "filter" : [
    "lowercase"
  ],
  "char_filter" : [ ],
  "type" : "custom",
  "tokenizer" : "ngram_tokenizer"
}

"tokenizer" : {
  "ngram_tokenizer" : {
    "type" : "ngram",
    "min_gram" : "2",
    "max_gram" : "3"
  }
}

When I search for the text "ikea", I get the results below.

Query:

GET company_info_test_1/_search
{
  "query": {
    "match": {
      "COMPNAYNAME": {"query": "ikea"}
    }
  }
}

These are the results:

1. mikea
2. likeable
3. maaikeart
4. likeables
5. ikea b.v.  <------
6. likeachef
7. ikea breda <------
8. bernikeart
9. ikea duiven
10. mikea media

I expect exact matches to be boosted above the rest of the results. What is the best way to index this field if I need to search with exact matching as well as with fuzziness?

Thanks in advance.

Upvotes: 3

Views: 95

Answers (1)

Bhavya

Reputation: 16172

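You can keep the ngram tokenizer for indexing and add "search_analyzer": "standard" to the field, so that the query text itself is not split into n-grams. Refer to the Elasticsearch documentation on search_analyzer to know more.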
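To see why the original query also matches names such as "mikea", it can help to run the query text through the field's analyzer with the analyze API. A minimal sketch, assuming the index and analyzer names from the question (company_info_test_1 and textAnalyzer), sent as the body of GET company_info_test_1/_analyze:

```json
{
    "analyzer": "textAnalyzer",
    "text": "ikea"
}
```

With min_gram 2 and max_gram 3 this should return the tokens ik, ike, ke, kea and ea. When no separate search_analyzer is set, the match query looks for any of those fragments, so every name containing one of them is a hit. With "search_analyzer": "standard" the query stays a single token, ikea.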

As pointed out by @EvaldasBuinauskas, you can also use the edge_ngram tokenizer here if you want tokens to be generated only from the beginning of each word and not from the middle.
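A minimal sketch of that edge_ngram variant, assuming the same analyzer name, tokenizer name and gram sizes as in the ngram mapping in the working example; only the tokenizer type changes:

```json
{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "my_tokenizer"
                }
            },
            "tokenizer": {
                "my_tokenizer": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 10,
                    "token_chars": [
                        "letter",
                        "digit"
                    ]
                }
            }
        }
    }
}
```

With this tokenizer "ikea b.v." is indexed as ik, ike and ikea, while "mikea" is indexed as mi, mik, mike and mikea, so a search for ikea would no longer match mikea or maaikeart.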

Here is a working example with index data, mapping, search query, and result.

Index Data:

{ "title": "ikea b.v."}
{ "title" : "mikea" }
{ "title" : "maaikeart"}

Index Mapping:

{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "my_tokenizer",
                    "filter": [
                        "lowercase"
                    ]
                }
            },
            "tokenizer": {
                "my_tokenizer": {
                    "type": "ngram",
                    "min_gram": 2,
                    "max_gram": 10,
                    "token_chars": [
                        "letter",
                        "digit"
                    ]
                }
            }
        },
        "max_ngram_diff": 50
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "my_analyzer",
                "search_analyzer": "standard"
            }
        }
    }
}

Search Query:

{
    "query": {
        "match" : {
            "title" : "ikea"
        }
    }
}

Search Result:

"hits": [
            {
                "_index": "normal",
                "_type": "_doc",
                "_id": "4",
                "_score": 0.1499838,    <-- note this
                "_source": {
                    "title": "ikea b.v."
                }
            },
            {
                "_index": "normal",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.13562363,    <-- note this
                "_source": {
                    "title": "mikea"
                }
            },
            {
                "_index": "normal",
                "_type": "_doc",
                "_id": "3",
                "_score": 0.083597526,
                "_source": {
                    "title": "maaikeart"
                }
            }
        ]
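In the result above, "ikea b.v." probably ranks first because its field produces fewer n-grams than "mikea", not because it is an exact match. If exact matches need an explicit boost, one possible approach (not part of the tested example above) is to add a sub-field analyzed with the standard analyzer and boost matches on it. A sketch, assuming a hypothetical sub-field named exact and the my_analyzer settings from the mapping above:

```json
{
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "my_analyzer",
                "search_analyzer": "standard",
                "fields": {
                    "exact": {
                        "type": "text",
                        "analyzer": "standard"
                    }
                }
            }
        }
    }
}
```

The query then requires an n-gram match and adds extra score when the whole word matches; the boost value 2 is an arbitrary starting point:

```json
{
    "query": {
        "bool": {
            "must": {
                "match": {
                    "title": "ikea"
                }
            },
            "should": {
                "match": {
                    "title.exact": {
                        "query": "ikea",
                        "boost": 2
                    }
                }
            }
        }
    }
}
```

Here "mikea" and "maaikeart" still match through the n-gram field, but only documents containing the full word ikea, such as "ikea b.v.", receive the additional score from the should clause.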

Upvotes: 2
