Reputation: 15
I have an array of tags containing words.
tags: ['australianbrownsnake', 'venomoussnake', ...]
How do I match these tags against search terms such as 'brown snake', 'australian snake', 'venomous', and 'venomous brown snake'?
I am not even sure if this is possible since I am new to Elasticsearch. Help would be appreciated. Thank you.
Edit: I have created an ngram analyzer and added a sub-field called ngram, like so:
"properties": {
"tags": {
"type": "text",
"fields": {
"ngram": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
I tried the following query, but had no luck:
"query": {
"multi_match": {
"query": "snake",
"fields": [
"tags.ngram"
],
"type": "most_fields"
}
}
My tag mapping is as follows:
"tags" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
},
"ngram" : {
"type" : "text",
"analyzer" : "my_analyzer"
}
}
},
My settings are:
{
"image" : {
"settings" : {
"index" : {
"max_ngram_diff" : "10",
"number_of_shards" : "1",
"provided_name" : "image",
"creation_date" : "1572590562106",
"analysis" : {
"analyzer" : {
"my_analyzer" : {
"tokenizer" : "my_tokenizer"
}
},
"tokenizer" : {
"my_tokenizer" : {
"token_chars" : [
"letter",
"digit"
],
"min_gram" : "3",
"type" : "ngram",
"max_gram" : "10"
}
}
},
"number_of_replicas" : "1",
"uuid" : "pO9F7W43QxuZmI9vmXfKyw",
"version" : {
"created" : "7040299"
}
}
}
}
}
Update:
This config works fine. It was my mistake: I was searching on the wrong index.
Upvotes: 1
Views: 62
Reputation: 1547
You need to index your tags in the way you want to search them. For queries like 'brown snake' or 'australian snake' to match your tags, the tags need to be broken into smaller tokens at index time.
By default, Elasticsearch indexes text fields by passing them through its standard analyzer, which leaves a tag like 'australianbrownsnake' as a single token. You can define a custom analyzer that tokenizes strings into n-grams instead. With a gram size of 3-10, your 'australianbrownsnake' tag is stored as something like: ['aus', 'aust', ..., 'tra', 'tral', ...]
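To see why this helps, here is a rough Python sketch of what an n-gram tokenizer with min_gram 3 and max_gram 10 produces for a single tag. The ngrams function is a hypothetical stand-in, not Elasticsearch code, and it assumes the tag contains only letters and digits, so it ignores the splitting that token_chars would do.

```python
def ngrams(text, min_gram=3, max_gram=10):
    """Return every substring of text whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(text)):
        # For each start position, emit grams of increasing length.
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


tokens = ngrams("australianbrownsnake")
print(tokens[:4])  # ['aus', 'aust', 'austr', 'austra']

# Whole words hidden inside the tag are now tokens in their own right.
print("brown" in tokens, "snake" in tokens, "australian" in tokens)  # True True True
```

Note that a word longer than max_gram never appears as a single token, so a query word of 11 or more characters would not find an exact gram under these settings.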
You can then modify your search query to match on your tags.ngram field, and you should get the desired results.
The tags.ngram field can be created like so:
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
using an ngram tokenizer:
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html
EDIT1: By default, Elasticsearch analyzes the query keywords with the analyzer of the field being matched on. You might not need the user query to be tokenized into n-grams, since a matching n-gram should already be stored in the tags.ngram field. You could specify standard as the search_analyzer in your mappings.
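A minimal sketch of that idea in Python, under a few assumptions: tags_mapping is a hypothetical mapping fragment written as a dict (search_analyzer is the mapping parameter that sets the query-time analyzer), and matched_terms is an illustrative helper that only approximates the standard analyzer by lowercasing and splitting on whitespace. It is a simulation of the matching logic, not a call to Elasticsearch.

```python
def ngrams(text, min_gram=3, max_gram=10):
    """Set of all substrings of text with length between min_gram and max_gram."""
    return {
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    }


# Hypothetical mapping fragment: n-grams at index time, whole words at search time.
tags_mapping = {
    "tags": {
        "type": "text",
        "fields": {
            "ngram": {
                "type": "text",
                "analyzer": "my_analyzer",
                "search_analyzer": "standard",
            }
        },
    }
}


def matched_terms(query, tag):
    """Query words (lowercased, split on whitespace) found among the tag's n-grams."""
    indexed = ngrams(tag)
    return [word for word in query.lower().split() if word in indexed]


for query in ["brown snake", "australian snake", "venomous", "venomous brown snake"]:
    for tag in ["australianbrownsnake", "venomoussnake"]:
        print(f"{query!r} vs {tag!r}: {matched_terms(query, tag)}")
```

With the default OR operator of a match query, a tag is a hit as soon as one query word is found, so 'venomous brown snake' would match both tags here, each on different words.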
Upvotes: 1