michael lother

Reputation: 139

Synonym analyzer not giving results

I have the following custom analyzer defined:

{
  "analysis": {
    "analyzer": {
      "products-alike": {
        "filter": [
          "lowercase",
          "product-db"
        ],
        "tokenizer": "standard"
      }
    },
    "filter": {
      "product-db": {
        "type": "synonym",
        "synonyms": [
          "Xiaomi,Mi,Mi3,Mi4,Redmi",
          "OnePlus,OnePlusOne,OnePlus1,OnePlus2"
        ]
      }
    }
  }
}

I have mapped this analyzer to the required field and run queries against it. However, only exact matches return results: querying Xiaomi works, but Mi or Mi3 returns nothing. Why is this happening, and how can I make it work?

Upvotes: 0

Views: 45

Answers (1)

Val

Reputation: 217254

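Your analyzer runs the lowercase filter before the synonym filter, so by the time tokens reach product-db they are already lowercase and never match the CamelCase entries. You simply need to write all your synonyms in lowercase instead of CamelCase, like this: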

{
  "analysis": {
    "analyzer": {
      "products-alike": {
        "filter": [
          "lowercase",
          "product-db"
        ],
        "tokenizer": "standard"
      }
    },
    "filter": {
      "product-db": {
        "type": "synonym",
        "synonyms": [
          "xiaomi,mi,mi3,mi4,redmi",
          "oneplus,oneplusone,oneplus1,oneplus2"
        ]
      }
    }
  }
}

With that change it works: if you run Mi3 through the analyzer, it expands to all the synonym tokens:

curl -XGET 'localhost:9200/your_index/_analyze?analyzer=products-alike&pretty' -d 'Mi3'

Results:

{
  "tokens" : [ {
    "token" : "xiaomi",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  }, {
    "token" : "mi",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  }, {
    "token" : "mi3",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  }, {
    "token" : "mi4",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  }, {
    "token" : "redmi",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "SYNONYM",
    "position" : 1
  } ]
}
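
To make the case issue concrete, here is a toy Python model of the filter chain. It is only a sketch and not how Elasticsearch implements it: the function names (lowercase_filter, synonym_filter, analyze) are hypothetical, the tokenizer is a plain whitespace split, and synonym lookup is assumed to be an exact, case-sensitive match.

```python
def lowercase_filter(tokens):
    """Stand-in for the 'lowercase' token filter."""
    return [t.lower() for t in tokens]


def synonym_filter(tokens, rules):
    """Stand-in for a 'synonym' filter with exact, case-sensitive matching."""
    lookup = {}
    for rule in rules:
        group = [term.strip() for term in rule.split(",")]
        for term in group:
            lookup[term] = group
    expanded = []
    for token in tokens:
        # A token with no matching rule passes through unchanged.
        expanded.extend(lookup.get(token, [token]))
    return expanded


def analyze(text, rules):
    """Apply the filters in the same order as the analyzer: lowercase, then synonyms."""
    tokens = text.split()  # simplification of the standard tokenizer
    return synonym_filter(lowercase_filter(tokens), rules)


camel_rules = ["Xiaomi,Mi,Mi3,Mi4,Redmi", "OnePlus,OnePlusOne,OnePlus1,OnePlus2"]
lower_rules = ["xiaomi,mi,mi3,mi4,redmi", "oneplus,oneplusone,oneplus1,oneplus2"]

print(analyze("Mi3", camel_rules))  # ['mi3']
print(analyze("Mi3", lower_rules))  # ['xiaomi', 'mi', 'mi3', 'mi4', 'redmi']
```

With the CamelCase rules the lowercased token mi3 finds no rule and passes through alone, which is why only exact matches were returned. With the lowercase rules it expands to the whole group.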

Upvotes: 2
