rahul

Reputation: 606

In Elasticsearch, how to query a place name using fuzzy search

I have inserted locations into Elasticsearch. Below is a sample of the stored locations:

[
  {
    "lat": 1,
    "lon": 1,
    "place": "asddda ddsd asdad vasanth hhjkhk sdsdd asddasd"
  },
  {
    "lat": 2,
    "lon": 2,
    "place": "asddda ddsd asdad vasanth1 hhjkhk sdsdd asddasd"
  },
  {
    "lat": 3,
    "lon": 3,
    "place": "asddda ddsd asdad vasanth2 hhjkhk sdsdd asddasd"
  },
  {
    "lat": 4,
    "lon": 4,
    "place": "asddda ddsd asdad test hhjkhk sdsdd asddasd"
  }
]

If I search for vasanth, it gives the correct result, returning all 3 matching documents. But if I search with a character deleted (Vsanth), it gives only one result, when it should still give all 3. The same happens if I insert a character: it does not work properly.

According to the Elasticsearch docs, the fuzzy query supports the following edits:

Changing a character (box → fox)
Removing a character (black → lack)
Inserting a character (sic → sick)
Transposing two adjacent characters (act → cat)

Below is the query I am using:

{
    "query": {
        "fuzzy": {
            "address": {
                "value": "Vsanth",
                "fuzziness": 15,
                "transpositions": true,
                "boost": 5
            }
        }
    }
}

So, how can I modify the query to use all four features of the fuzzy query? I can't tell what mistake I have made.

Upvotes: 0

Views: 896

Answers (2)

Gibbs

Reputation: 22964

The problem is that matching vasanth1 and vasanth2 requires a larger edit distance than is allowed.

Reference

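The fuzziness parameter can be specified as:

0, 1, 2 = The maximum allowed edit distance

AUTO = An edit distance chosen from the length of the term:

0..2 characters = Must match exactly

3..5 characters = One edit allowed

More than 5 characters = Two edits allowed

And you specified fuzziness as 15, but at most two edits are allowed, which is the maximum in Elasticsearch.

So the problem here is that your query requires an edit distance of 3, which is not supported.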
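In the Elasticsearch documentation, these length ranges describe the AUTO setting of fuzziness. A minimal Python sketch of that mapping, assuming the default thresholds of 3 and 6 (the function name auto_fuzziness is made up for illustration and is not part of any Elasticsearch client):

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Return the number of edits AUTO would allow for a term of this length.

    Hypothetical helper: low and high mirror the assumed defaults (3 and 6).
    """
    length = len(term)
    if length < low:
        return 0  # 0..2 characters: must match exactly
    if length < high:
        return 1  # 3..5 characters: one edit allowed
    return 2      # more than 5 characters: two edits allowed


for term in ("ab", "box", "Vsanth"):
    print(term, auto_fuzziness(term))
```

The search term Vsanth has 6 characters, so even with AUTO it would get at most two edits.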

Why 3:

Vsanth --> vsanth --> vasanth --> vasanth1

  1. Case change -> capital V to lowercase v
  2. Addition of a -> vasanth
  3. One more addition (1) -> vasanth1

Hence only the document containing the exact term vasanth is matched.

Also, fuzzy queries are term-level queries, so the search term is not analyzed. Adding a lowercase filter to your place field's analyzer will not help here, because that analyzer is not applied to the search term.
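To check the edit counts above by hand, here is a rough Python sketch of an edit distance that counts insertions, deletions, substitutions and adjacent transpositions (the optimal string alignment variant). It is only an illustration and not the code Elasticsearch or Lucene runs internally. The function name osa_distance is my own.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance counting insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "act" -> "cat"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for term in ("vasanth", "vasanth1", "vasanth2"):
    # Compare the raw search term and a lowercased copy against each indexed term.
    print(term, osa_distance("Vsanth", term), osa_distance("vsanth", term))
```

With the capital V, vasanth is 2 edits away and vasanth1 and vasanth2 are 3 edits away, which is over the limit of 2. If the search term is lowercased first, the distances drop to 1 and 2.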

Upvotes: 1

Bhavya

Reputation: 16172

You can use the fuzzy operator (~) for fuzzy searching in a query_string query:

This uses the Damerau-Levenshtein distance to find all terms with a maximum of two changes, where a change is the insertion, deletion or substitution of a single character, or transposition of two adjacent characters.

For a detailed explanation, refer to the official documentation.

Below is a working example with the search query and search result, using the same sample index data as in the question.

Search Query:

{
  "query": {
    "query_string": {
      "query": "Vsanth~"
    }
  }
}

Search Result:

"hits": [
  {
    "_index": "foo",
    "_type": "_doc",
    "_id": "1",
    "_score": 1.0033107,
    "_source": {
      "lat": 1,
      "lon": 1,
      "place": "asddda ddsd asdad vasanth hhjkhk sdsdd asddasd"
    }
  },
  {
    "_index": "foo",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.8026485,
    "_source": {
      "lat": 2,
      "lon": 2,
      "place": "asddda ddsd asdad vasanth1 hhjkhk sdsdd asddasd"
    }
  },
  {
    "_index": "foo",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.8026485,
    "_source": {
      "lat": 3,
      "lon": 3,
      "place": "asddda ddsd asdad vasanth2 hhjkhk sdsdd asddasd"
    }
  }
]

Upvotes: 1
