Timeless

Reputation: 7527

elasticsearch synonyms not working as expected

I'm searching for the text 2 marina blvd, and the top 3 results returned by Elasticsearch are:

2 MARINA GREEN, SINGAPORE 019800
MARINA BAYFRONT 2 RAFFLES LINK, SINGAPORE 039392
THE SAIL @ MARINA BAY 2 MARINA BOULEVARD, SINGAPORE 018987

In my synonyms list, blvd is the same as boulevard.

When I search for 2 marina blvd, I expect THE SAIL @ MARINA BAY 2 MARINA BOULEVARD, SINGAPORE 018987 to be at the top with the highest score, since 2 marina blvd is equivalent to 2 marina boulevard. Instead, 2 MARINA GREEN, SINGAPORE 019800 is at the top.

What went wrong, and how can I improve my search results?

The full settings are:

{
  "geolocation": {
    "settings": {
      "index": {
        "creation_date": "1471322099847",
        "analysis": {
          "filter": {
            "my_synonym_filter": {
              "type": "synonym",
              "synonyms": [
                "rd,road",
                "ave,avenue",
                "blvd,boulevard",
                "st,street",
                "lor,lorong",
                "ter,terminal",
                "blk,block",
                "apt,apartment",
                "condo,condominium"
              ]
            }
          },
          "analyzer": {
            "my_synonyms": {
              "filter": [
                "lowercase",
                "my_synonym_filter"
              ],
              "tokenizer": "standard"
            },
            "stopwords_analyzer": {
              "type": "standard",
              "stopwords": [
                "the"
              ]
            },
            "my_ngram_analyzer": {
              "tokenizer": "my_ngram_tokenizer"
            }
          },
          "tokenizer": {
            "my_ngram_tokenizer": {
              "token_chars": [
                "letter",
                "digit"
              ],
              "min_gram": "2",
              "type": "nGram",
              "max_gram": "5"
            }
          }
        },
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "mPfZmWHFQZOHqfAi471nGQ",
        "version": {
          "created": "2030599"
        }
      }
    }
  }
}

And this is the query:

body: {
      from : 0, size : 10,
      query: {
        bool: {
          should: [
            {
              match: {
                text: q
              }
            },
            {
              match: {
                text: {
                  query: q,
                  fuzziness: 1,
                  prefix_length: 0,
                  max_expansions: 100
                }
              }
            },
            {
              match: {
                text: {
                  query: q,
                  max_expansions: 300,
                  type: "phrase_prefix"
                }
              }
            }
          ]
        }
      }
    }

And the mapping is:

{
  "geolocation": {
    "mappings": {
      "location": {
        "properties": {
          "address": {
            "type": "string"
          },
          "blk": {
            "type": "string"
          },
          "building": {
            "type": "string"
          },
          "location": {
            "type": "geo_point"
          },
          "postalCode": {
            "type": "string"
          },
          "road": {
            "type": "string"
          },
          "searchText": {
            "type": "string"
          },
          "x": {
            "type": "string"
          },
          "y": {
            "type": "string"
          }
        }
      }
    }
  }
}

Upvotes: 1

Views: 1262

Answers (1)

Andrei Stefan

Reputation: 52368

You defined analyzers but you haven't set any of them for your fields. The most basic setup would be:

"searchText": {
  "type": "string",
  "analyzer": "my_synonyms"
}

A field can have one analyzer at indexing time and another at searching time. Most use cases use the same analyzer for both. By default (when you set only "analyzer": "whatever_analyzer"), the same analyzer is used at both searching and indexing time.
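As a rough sketch of how the two settings look side by side, here is a mapping fragment that names both analyzers explicitly. It assumes searchText is the field being queried (the query in the question targets a field named text, which does not appear in the posted mapping). The "search_analyzer" line is redundant here, since it repeats the default, and is shown only to make the two settings visible:

```json
{
  "properties": {
    "searchText": {
      "type": "string",
      "analyzer": "my_synonyms",
      "search_analyzer": "my_synonyms"
    }
  }
}
```

Note that the analyzer of an existing field generally cannot be changed in place, so applying this would most likely mean creating a new index with the updated mapping and reindexing the documents.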

To get more insight into analysis and what you can do with it, please see https://www.elastic.co/guide/en/elasticsearch/guide/2.x/analysis-intro.html.
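One way to check what an analyzer actually produces is the _analyze API. A minimal sketch of a request body, assuming it is sent to the geolocation index's _analyze endpoint (for example GET /geolocation/_analyze) and that the my_synonyms analyzer from the question's settings is in place:

```json
{
  "analyzer": "my_synonyms",
  "text": "2 marina blvd"
}
```

If the synonym filter is being applied, the returned tokens should include both blvd and boulevard; if only blvd comes back, the synonyms are not taking effect for that analyzer.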

Upvotes: 1
