BottleSpeaker

Reputation: 15

Query to partially match every word in a search term in Elasticsearch

I have an array of tags containing words.

tags: ['australianbrownsnake', 'venomoussnake', ...]

How do I match this against search terms such as 'brown snake', 'australian snake', 'venomous', and 'venomous brown snake'?

I am not even sure if this is possible since I am new to Elasticsearch. Help would be appreciated. Thank you.

Edit: I have created an ngram analyzer and added a field called ngram, like so:

properties": {
    "tags": {
      "type": "text",
      "fields": {
          "ngram": { 
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
    }

I tried the following query, but no luck:

"query": {
        "multi_match": {
          "query": "snake",
          "fields": [
            "tags.ngram"
          ],
          "type": "most_fields"
        }
      }

My tag mapping is as follows:

        "tags" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            },
            "ngram" : {
              "type" : "text",
              "analyzer" : "my_analyzer"
            }
          }
        },

My settings are:

{
  "image" : {
    "settings" : {
      "index" : {
        "max_ngram_diff" : "10",
        "number_of_shards" : "1",
        "provided_name" : "image",
        "creation_date" : "1572590562106",
        "analysis" : {
          "analyzer" : {
            "my_analyzer" : {
              "tokenizer" : "my_tokenizer"
            }
          },
          "tokenizer" : {
            "my_tokenizer" : {
              "token_chars" : [
                "letter",
                "digit"
              ],
              "min_gram" : "3",
              "type" : "ngram",
              "max_gram" : "10"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "pO9F7W43QxuZmI9vmXfKyw",
        "version" : {
          "created" : "7040299"
        }
      }
    }
  }
}
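To sanity-check the analyzer, the token output can be inspected with the _analyze API (a quick sketch against the image index defined above; the sample text is just an illustration):

GET /image/_analyze
{
  "analyzer": "my_analyzer",
  "text": "australianbrownsnake"
}

This should return 3-10 character grams such as "aus", "aust", ..., "sna", "snak", "snake".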

Update:

This config works fine. It was my mistake: I was searching on the wrong index.
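For completeness, the working request looks roughly like this (a sketch assuming the image index from the settings above; "brown snake" is one of the example search terms):

GET /image/_search
{
  "query": {
    "multi_match": {
      "query": "brown snake",
      "fields": ["tags.ngram"],
      "type": "most_fields"
    }
  }
}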

Upvotes: 1

Views: 62

Answers (1)

Archit Saxena

Reputation: 1547

You need to index your tags in the way you want to search them. For queries like 'brown snake' or 'australian snake' to match your tags, you need to break the tags into smaller tokens.

By default, Elasticsearch indexes strings by passing them through its standard analyzer. You can create a custom analyzer that tokenizes strings into nGrams and stores the field however you want. With a gram size of 3-10, your 'australianbrownsnake' tag would be stored as something like ['aus', 'aust', ..., 'tra', 'tral', ...].

You can then modify your search query to match on the tags.ngram field, and you should get the desired results. The tags.ngram field can be created as a multi-field:

https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html

using the ngram tokenizer:

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html
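Putting the two together, a minimal index definition might look like this (a sketch only; the index name my_index, the analyzer/tokenizer names, and the gram sizes are placeholders you can adjust):

PUT /my_index
{
  "settings": {
    "index": {
      "max_ngram_diff": 10
    },
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "tags": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}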

EDIT1: By default, Elasticsearch analyzes the query keywords with the analyzer of the field being matched on. You might not want the user's query to be tokenized into nGrams, since a matching nGram should already be stored in the tags field; you can specify the standard analyzer as the search_analyzer in your mapping.
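For example, the ngram sub-field could keep my_analyzer for indexing but use the standard analyzer at search time (a sketch reusing the field names from the question):

"tags": {
  "type": "text",
  "fields": {
    "ngram": {
      "type": "text",
      "analyzer": "my_analyzer",
      "search_analyzer": "standard"
    }
  }
}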

Upvotes: 1
