Philipp

Reputation: 4270

Elasticsearch - Autocomplete return word/term/token suggestions instead of whole documents

I am trying to implement simple autocompletion for query terms. There are many different approaches, but most of them return documents instead of terms, or the authors stop explaining at that point and I am not able to adapt their examples.

A user is typing in a query, e.g. phil. What I want is to provide a list of term completion suggestions like philipp, philius, philadelphia, ...

I am able to get document matches via (edge)ngrams, phrase_prefix and so on, but I am stuck at retrieving the matching terms (completion suggestions).

Can someone give me a hint?

I have documents like this: {"title":"...", "description":"...", "content":"..."}. All fields hold longer string values, and the content field in particular contains full-text content.

I do not want to suggest the whole title of a document containing e.g. Philadelphia. Just the word "Philadelphia".

Upvotes: 4

Views: 1121

Answers (2)

Anton

Reputation: 53

Try the term suggester:

The term suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The term suggester doesn’t take the query into account that is part of request.
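
For illustration, a request body for the term suggester could be built roughly like this. This is only a sketch: the field name `content` comes from the question, while the suggestion name `term-suggestions` and the helper `term_suggest_body` are made up here. Keep in mind that the suggester works on edit distance, so a short prefix such as phil may not come back with long completions like philadelphia.

```python
import json


def term_suggest_body(text, field="content", name="term-suggestions"):
    """Build a search request body that asks only for term suggestions.

    `name` is an arbitrary label chosen for this sketch; `field` is assumed
    to be an analyzed text field in the index.
    """
    return {
        "size": 0,  # no document hits, only the suggest section is of interest
        "suggest": {
            name: {
                "text": text,
                "term": {"field": field},
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON would be sent as the body of a _search request.
    print(json.dumps(term_suggest_body("phil"), indent=2))
```

The suggestions come back per analyzed token of the input text, under the label given as `name`.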

Upvotes: 0

bitterman0

Reputation: 31

Looking for something like that, myself.

In SOLR it was relatively simple to configure (although a pain to build and keep up-to-date) using solr.SpellCheckComponent. Somehow the same underlying Lucene functionality is used differently between SOLR and ElasticSearch, and in ElasticSearch it is geared towards finding whole documents (or whole field values, if you will) or so it seems...

Despite the profusion of "elasticsearch autocomplete" articles, none appears to deal with this particular issue. Like it doesn't exist. Maybe their use case is different and ElasticSearch works for them just fine, who knows?

At this point I think that preparing the exact field values to use with ElasticSearch autocomplete (yes, that's the input field values, not analyzer tokens) may be the only way to solve the problem. Which is terrible, because the performance is going to be very low.
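
To make that idea concrete, here is a rough sketch of the preparation step done on the client side. Everything in it is an assumption for illustration: the helper names, the naive `\w+` tokenizer (which will not match whatever analyzer the index uses), the minimum term length, and a separate field called `suggest` that would have to be mapped as a `completion` field.

```python
import re


def extract_terms(docs, fields=("title", "description", "content"), min_len=3):
    """Collect the distinct lowercase words found in the given fields."""
    terms = set()
    for doc in docs:
        for field in fields:
            # Naive tokenization; a real setup would mirror the index analyzer.
            for token in re.findall(r"\w+", doc.get(field, "").lower()):
                if len(token) >= min_len:
                    terms.add(token)
    return sorted(terms)


def to_suggest_docs(terms, field="suggest"):
    """Wrap each term as a small document whose completion field holds it."""
    return [{field: {"input": term}} for term in terms]


if __name__ == "__main__":
    docs = [
        {"title": "Philadelphia travel", "description": "",
         "content": "Philipp went to Philadelphia"},
    ]
    for suggest_doc in to_suggest_docs(extract_terms(docs)):
        print(suggest_doc)
```

Each resulting document carries one word as its own input value, so an autocomplete lookup on that field would return single words rather than whole titles. The cost is that this term list has to be rebuilt or updated whenever the source documents change.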

Upvotes: 1
