Errol Green

Reputation: 1387

Elasticsearch partial match but strict phrase matching

I'm looking for a way to do fuzzy partial matching against a field where the words match, but I also want strict phrase matching.

For example, say I have field values such as:

foo bar
bar foo

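I would like to achieve the following search behaviour:

If I search foo, I would like to return back both results.
If I search ba, I would like to return back both results.
If I search bar foo foo, I don't want to return any results.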

I would also like to add single-character fuzziness, so that if foo is mistyped as fbo, both results are still returned.

My current search and index analyzer uses an edge_ngram tokenizer and works fairly well, except that if any gram matches, it returns the result regardless of whether the following words match. For example, the search bar foo buzz returns both of the following results:

foo bar
bar foo

My tokenizer:

ngram_tokenizer: {
   type: "edge_ngram",
   min_gram: "2",
   max_gram: "15",
   token_chars: ['letter', 'digit', 'punctuation', 'symbol'],
},
          

My analyzer:

nGram_analyzer: {
  filter: [
    "lowercase",
    "asciifolding"
  ],
  type: "custom",
  tokenizer: "ngram_tokenizer"
},

My field mapping:


type: "search_as_you_type",
doc_values: false,
max_shingle_size: 3,
analyzer: "nGram_analyzer"
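
A rough sketch of why bar foo buzz still matches both documents with the settings above. This is plain Python, not Elasticsearch code: the function names are hypothetical, splitting on whitespace stands in for token_chars, asciifolding is ignored, and it assumes the query terms are combined with OR so that a single shared gram is enough for a hit.

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    """Emit the leading substrings (edge n-grams) of every whitespace-separated word."""
    tokens = []
    for word in text.lower().split():
        # Words shorter than min_gram produce no tokens at all.
        for n in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:n])
    return tokens


def any_gram_matches(query, field_value):
    """Assumed OR semantics: one shared gram between query and field is a hit."""
    return bool(set(edge_ngrams(query)) & set(edge_ngrams(field_value)))


if __name__ == "__main__":
    print(edge_ngrams("foo bar"))
    print(any_gram_matches("bar foo buzz", "foo bar"))
    print(any_gram_matches("bar foo buzz", "bar foo"))
```

Under these assumptions the query shares the grams fo, foo, ba and bar with both documents, so the unmatched word buzz does nothing to exclude them.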
          

Upvotes: 0

Views: 401

Answers (1)

Bhavya

Reputation: 16172

One way to achieve all your requirements is to use a span_near query.

Span near queries are much more verbose, but they are suitable for phrase matching combined with a fuzziness parameter.
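
To give a feel for the matching rule before looking at real queries, here is a minimal in-memory sketch of "consecutive terms, in order, each within a small edit distance". It is only an approximation: the helper names are hypothetical, it splits on whitespace instead of running an analyzer, and it uses plain Levenshtein distance, whereas Elasticsearch's fuzzy query also counts a transposition as one edit by default.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_phrase_match(query, field_value, fuzziness=1):
    """True if the query terms fuzzily match adjacent field tokens, in order."""
    q_terms = query.lower().split()
    d_terms = field_value.lower().split()
    if not q_terms:
        return False
    # Slide the query over the field; every term must match at its position.
    for start in range(len(d_terms) - len(q_terms) + 1):
        if all(edit_distance(term, d_terms[start + k]) <= fuzziness
               for k, term in enumerate(q_terms)):
            return True
    return False


if __name__ == "__main__":
    for query in ("foo", "fbo", "ba", "bar foo foo"):
        print(query, [fuzzy_phrase_match(query, doc) for doc in ("foo bar", "bar foo")])
```

In this sketch foo, fbo and ba match both documents, while bar foo foo matches neither because no document has three consecutive tokens to line up against.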

Here is a working example with index data, search queries, and search results.

Index Mapping:

{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      }
    }
  }
}

Index Data:

{
    "title":"bar foo"
}
{
    "title":"foo bar"
}

Search Queries:

If I search foo, I would like to return back both results.

{
  "query": {
    "bool": {
      "must": [
        {
          "span_near": {
            "clauses": [
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "title": {
                        "value": "foo",
                        "fuzziness": 2
                      }
                    }
                  }
                }
              }
            ],
            "slop": 0,
            "in_order": true
          }
        }
      ]
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "67205552",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.18232156,
        "_source": {
          "title": "bar foo"
        }
      },
      {
        "_index": "67205552",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.18232156,
        "_source": {
          "title": "foo bar"
        }
      }
    ]

If I search ba, I would like to return back both results.

{
  "query": {
    "bool": {
      "must": [
        {
          "span_near": {
            "clauses": [
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "title": {
                        "value": "ba",
                        "fuzziness": 2
                      }
                    }
                  }
                }
              }
            ],
            "slop": 0,
            "in_order": true
          }
        }
      ]
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "67205552",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.18232156,
        "_source": {
          "title": "bar foo"
        }
      },
      {
        "_index": "67205552",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.18232156,
        "_source": {
          "title": "foo bar"
        }
      }
    ]

If I search bar foo foo, I don't want to return any results.

{
  "query": {
    "bool": {
      "must": [
        {
          "span_near": {
            "clauses": [
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "title": {
                        "value": "bar",
                        "fuzziness": 2
                      }
                    }
                  }
                }
              },
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "title": {
                        "value": "foo",
                        "fuzziness": 2
                      }
                    }
                  }
                }
              },
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "title": {
                        "value": "foo",
                        "fuzziness": 2
                      }
                    }
                  }
                }
              }
            ],
            "slop": 0,
            "in_order": true
          }
        }
      ]
    }
  }
}

The search result will be empty.
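
The three queries above differ only in their clauses list, so the request body can be generated from the search string. A minimal sketch, assuming a hypothetical helper name, whitespace splitting, and lowercasing of the terms (the fuzzy query is term-level and is not analyzed, while the default analyzer on a text field lowercases the indexed tokens):

```python
import json


def build_fuzzy_span_near(field, text, fuzziness=2):
    """Build a span_near body with one fuzzy span_multi clause per search term."""
    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {field: {"value": term, "fuzziness": fuzziness}}
                }
            }
        }
        for term in text.lower().split()
    ]
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "span_near": {
                            "clauses": clauses,
                            "slop": 0,         # terms must be adjacent
                            "in_order": True,  # and appear in the typed order
                        }
                    }
                ]
            }
        }
    }


if __name__ == "__main__":
    # Prints a body equivalent to the "bar foo foo" query above.
    print(json.dumps(build_fuzzy_span_near("title", "bar foo foo"), indent=2))
```

For the single-character fuzziness asked for in the question, pass fuzziness=1 instead of the value 2 used in the examples.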

Upvotes: 2
