Reputation: 9568
We're using the Elasticsearch completion suggester with the standard analyzer, but the text does not appear to be tokenized.
e.g.
Texts: "First Example", "Second Example"
Search: "Fi" returns "First Example"
While
Search: "Ex" doesn't return any result (expected: "First Example")
Upvotes: 3
Views: 3459
Reputation: 29
One way to hack in suggestions from every position of the string is to shingle the string, keep only the shingles at position 0, and from each of those shingles keep only the last token.
PUT example
{
  "settings": {
    "index.max_shingle_diff": 10,
    "analysis": {
      "filter": {
        "after_last_space": {
          "type": "pattern_replace",
          "pattern": "(.* )",
          "replacement": ""
        },
        "preserve_only_first": {
          "type": "predicate_token_filter",
          "script": {
            "source": "token.position == 0"
          }
        },
        "big_shingling": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 10,
          "output_unigrams": true
        }
      },
      "analyzer": {
        "dark_magic": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "big_shingling",
            "preserve_only_first",
            "after_last_space"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "suggest": {
        "type": "completion",
        "analyzer": "dark_magic",
        "search_analyzer": "standard"
      }
    }
  }
}
This hack works for short strings (up to 10 tokens in the example).
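To see which tokens this analyzer chain is meant to produce, here is a rough pure-Python simulation of the filters. It does not call Elasticsearch: `dark_magic_tokens` is a hypothetical helper, and the `\w+` regex is only an approximation of the standard tokenizer.

```python
import re

def dark_magic_tokens(text, min_size=2, max_size=10):
    """Approximate the tokens the 'dark_magic' analyzer is meant to emit."""
    # Assumption: \w+ plus lower() roughly stands in for the
    # standard tokenizer followed by the lowercase filter.
    words = re.findall(r"\w+", text.lower())
    if not words:
        return []
    # Shingles starting at the first token (position 0): the unigram,
    # then every shingle of min_size..max_size tokens.
    shingles = [words[0]]
    for size in range(min_size, min(max_size, len(words)) + 1):
        shingles.append(" ".join(words[:size]))
    # pattern_replace "(.* )" -> "": drop everything up to the last space,
    # which leaves only the last token of each shingle.
    return [re.sub(r"(.* )", "", s) for s in shingles]

print(dark_magic_tokens("First Example"))  # ['first', 'example']
```

With more than `max_size` words, the words past the tenth never appear in the output, which is the length limit mentioned above.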
Upvotes: 0
Reputation: 2996
A good workaround is to tokenize the string yourself and put the tokens in a separate tokens field. You can then use two suggestions in your suggest query to search both fields.
Example:
PUT /example
{
  "mappings": {
    "doc": {
      "properties": {
        "full": {
          "type": "completion"
        },
        "tokens": {
          "type": "completion"
        }
      }
    }
  }
}
POST /example/doc/_bulk
{ "index":{} }
{"full": {"input": "First Example"}, "tokens": {"input": ["First", "Example"]}}
{ "index":{} }
{"full": {"input": "Second Example"}, "tokens": {"input": ["Second", "Example"]}}
POST /example/_search
{
  "suggest": {
    "full-suggestion": {
      "prefix": "Ex",
      "completion": {
        "field": "full",
        "fuzzy": true
      }
    },
    "token-suggestion": {
      "prefix": "Ex",
      "completion": {
        "field": "tokens",
        "fuzzy": true
      }
    }
  }
}
Search result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": 0,
    "hits": []
  },
  "suggest": {
    "full-suggestion": [
      {
        "text": "Ex",
        "offset": 0,
        "length": 2,
        "options": []
      }
    ],
    "token-suggestion": [
      {
        "text": "Ex",
        "offset": 0,
        "length": 2,
        "options": [
          {
            "text": "Example",
            "_index": "example",
            "_type": "doc",
            "_id": "Ikvk62ABd4o_n4U8G5yF",
            "_score": 2,
            "_source": {
              "full": {
                "input": "First Example"
              },
              "tokens": {
                "input": [
                  "First",
                  "Example"
                ]
              }
            }
          },
          {
            "text": "Example",
            "_index": "example",
            "_type": "doc",
            "_id": "I0vk62ABd4o_n4U8G5yF",
            "_score": 2,
            "_source": {
              "full": {
                "input": "Second Example"
              },
              "tokens": {
                "input": [
                  "Second",
                  "Example"
                ]
              }
            }
          }
        ]
      }
    ]
  }
}
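The "tokenize the string yourself" step could be done client-side before indexing. Below is a minimal Python sketch that builds the same bulk body as in the example above. The helper names `to_completion_doc` and `to_bulk_body` are hypothetical, and splitting on whitespace is an assumption; use whatever tokenization suits your data.

```python
import json

def to_completion_doc(text):
    """Build one document with the full string and its separate tokens."""
    # Assumption: plain whitespace splitting is good enough as a tokenizer.
    return {"full": {"input": text}, "tokens": {"input": text.split()}}

def to_bulk_body(texts):
    """Build a newline-delimited _bulk body: one action line per document."""
    lines = []
    for text in texts:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(to_completion_doc(text)))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

print(to_bulk_body(["First Example", "Second Example"]))
```

The resulting string can be sent as the body of the `POST /example/doc/_bulk` request shown earlier.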
Upvotes: 1
Reputation: 2412
According to the Elastic documentation on the Completion Suggester:
The completion suggester is a so-called prefix suggester.
So when you send a keyword, it looks for texts that start with that keyword.
E.g:
Search: "Fi" => "First Example"
Search: "Sec" => "Second Example"
but if you give Elastic "Ex", it returns nothing because it cannot find a text which begins with "Ex".
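The prefix behavior can be illustrated with a toy model in Python. This is only a sketch of the matching rule, not how Elasticsearch implements it internally; `prefix_suggest` is a hypothetical function, and the case-insensitive comparison is an assumption that mirrors a lowercasing analyzer.

```python
def prefix_suggest(prefix, inputs):
    """Return the inputs whose whole string starts with the given prefix."""
    # Assumption: compare case-insensitively, as a lowercasing analyzer would.
    needle = prefix.lower()
    return [text for text in inputs if text.lower().startswith(needle)]

texts = ["First Example", "Second Example"]
print(prefix_suggest("Fi", texts))   # ['First Example']
print(prefix_suggest("Sec", texts))  # ['Second Example']
print(prefix_suggest("Ex", texts))   # []
```

Because the match is anchored at the start of the entire input, a word in the middle of the string such as "Example" is never reached.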
You can try other suggesters, such as the Term Suggester.
Upvotes: 3