Reputation: 55
I use the default "english" analyzer for searching documents, and it works well. However, I also need "did you mean" suggestions when the search query is misspelled, or the ability to search with such misspelled phrases.
What analyzers, filters, or queries do I need to achieve this behaviour?
Source text
Elasticsearch is a distributed, open source search and analytics engine for all types of data,
including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built
on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic).
Known for its simple REST APIs, distributed nature, speed, and scalability, Elasticsearch is
the central component of the Elastic Stack, a set of open source tools for data ingestion,
enrichment, storage, analysis, and visualization. Commonly referred to as the ELK Stack
(after Elasticsearch, Logstash, and Kibana), the Elastic Stack now includes a rich collection
of lightweight shipping agents known as Beats for sending data to Elasticsearch.
Search terms
search query => did you mean XXX?
missing letter or a similar typo
Elastisearch => Elasticsearch
distribated => distributed
Apacje => Apache
extra space
Elastic search => Elasticsearch
no space
opensource => open source
misspelled phrase
serach engne => search engine
Upvotes: 1
Views: 1587
Reputation: 32376
Your first case (a missing or wrong letter) can be handled with the fuzzy query. The second case (an extra or missing space) can be handled with a custom analyzer that uses an ngram or edge-ngram tokenizer; for examples of that, please refer to my blog on autocomplete.
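For the extra-space and no-space cases, a minimal sketch of index settings with an ngram tokenizer could look like the following. The names `trigram_tokenizer`, `title_ngram`, and the `title.ngram` sub-field are hypothetical, the gram sizes are assumptions to tune, and the `english` analyzer is kept on the main field as in the question.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "trigram_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "title_ngram": {
          "type": "custom",
          "tokenizer": "trigram_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "english",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "title_ngram"
          }
        }
      }
    }
  }
}
```

A match query against `title.ngram` for input such as `Elastic search` or `opensource` would then share most of its character grams with the indexed text even though the word boundaries differ. The gram sizes 3 and 4 stay within the default `index.max_ngram_diff` of 1; a wider range would need that index setting raised.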
Here is a fuzzy query example on your sample docs.
Index mapping
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      }
    }
  }
}
Index your sample docs and use the search queries below.
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "distributed"
      }
    }
  }
}
And the search result
"hits": [
  {
    "_index": "didyou",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.89166296,
    "_source": {
      "title": "distribated"
    }
  }
]
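The query above relies on the fuzzy query's default options. As a sketch, here is the same request with those options written out; the values shown are the documented defaults as far as I know, so it should return the same hit.

```json
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "distributed",
        "fuzziness": "AUTO",
        "prefix_length": 0,
        "max_expansions": 50,
        "transpositions": true
      }
    }
  }
}
```

With `AUTO`, terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. Raising `prefix_length` requires that many leading characters to match exactly, which narrows the candidate terms and usually speeds the query up.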
And for Elasticsearch
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Elasticsearch"
      }
    }
  }
}
And the search result
"hits": [
  {
    "_index": "didyou",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.8173577,
    "_source": {
      "title": "Elastisearch"
    }
  }
]
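The fuzzy query is a term-level query: its value is not analyzed, and it compares a single term at a time. For a misspelled phrase such as `serach engne`, one option is a match query with the `fuzziness` parameter, which analyzes the input and applies fuzzy matching to each resulting term. A minimal sketch, assuming the same `title` field and an indexed document that contains `search engine`:

```json
{
  "query": {
    "match": {
      "title": {
        "query": "serach engne",
        "fuzziness": "AUTO",
        "operator": "and"
      }
    }
  }
}
```

The `"operator": "and"` setting is an assumption that every word of the input should match; dropping it falls back to the default `or`, which returns documents matching any one of the words.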
Upvotes: 2