Reputation: 32316
I have saved user input directly in Elasticsearch. The name field contains several spelling variations for the same student:
PrabhuNath Prasad
PrabhuNathPrasad
Prabhu NathPrasad
Prabhu Nath Prashad
PrabhuNath Prashad
PrabhuNathPrashad
Prabhu NathPrashad
The student's real name is "Prabhu Nath Prasad", and when I search by that name I should get all of the above results back. Is there an analyzer in Elasticsearch that can take care of this?
Upvotes: 2
Views: 2082
Reputation: 12672
You could do that with a custom analyzer. This is my setup:
POST name_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "char_filter": [
            "space_removal"
          ],
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      },
      "char_filter": {
        "space_removal": {
          "type": "pattern_replace",
          "pattern": "\\s+",
          "replacement": ""
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "variation": {
              "type": "string",
              "analyzer": "my_custom_analyzer"
            }
          }
        }
      }
    }
  }
}
I have mapped name with both the standard analyzer and my_custom_analyzer (on the name.variation sub-field). The custom analyzer uses the keyword tokenizer and the lowercase and asciifolding filters, along with a char_filter that removes whitespace and joins the string into a single token. This char_filter will help us query the different variations effectively.
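To make the effect of the analyzer concrete, here is a rough Python approximation of what my_custom_analyzer does to each input. It is only a sketch: normalize_name is a hypothetical helper, and the NFKD-based folding is a stand-in for asciifolding, not Elasticsearch's actual implementation.

```python
import re
import unicodedata


def normalize_name(text: str) -> str:
    """Approximate the analyzer chain: remove whitespace, lowercase, fold to ASCII."""
    no_spaces = re.sub(r"\s+", "", text)        # char_filter: pattern_replace "\\s+" -> ""
    lowered = no_spaces.lower()                 # filter: lowercase
    decomposed = unicodedata.normalize("NFKD", lowered)
    return decomposed.encode("ascii", "ignore").decode("ascii")  # rough asciifolding


variations = [
    "PrabhuNath Prasad",
    "PrabhuNathPrasad",
    "Prabhu NathPrasad",
    "Prabhu Nath Prashad",
    "PrabhuNath Prashad",
    "PrabhuNathPrashad",
    "Prabhu NathPrashad",
]

# The keyword tokenizer emits the whole (space-stripped) string as one token.
tokens = {normalize_name(v) for v in variations}
print(sorted(tokens))  # ['prabhunathprasad', 'prabhunathprashad']
```

All seven inputs collapse to just two tokens, which differ by a single character. That remaining difference is what the fuzziness setting in the query is for.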
I indexed all 7 combinations you listed. This is my query:
GET name_index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "Prabhu Nath Prasad"
          }
        },
        {
          "match": {
            "name.variation": {
              "query": "Prabhu Nath Prasad",
              "fuzziness": "AUTO"
            }
          }
        }
      ]
    }
  }
}
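This handles all your variations. Because the match on name goes through the standard analyzer, it will also return documents that contain only some of the terms, such as prabhu or prasad.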
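As a sanity check on why "fuzziness": "AUTO" is enough for name.variation, the sketch below computes the edit distance between the space-stripped query and each space-stripped variation. The helpers levenshtein and auto_fuzziness are hypothetical names; the AUTO thresholds follow the documented defaults (0 edits up to 2 characters, 1 edit for 3 to 5, 2 edits beyond that). Elasticsearch itself counts a transposition as one edit, which plain Levenshtein distance does not, so treat this as an approximation.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def auto_fuzziness(term: str) -> int:
    """Maximum edits allowed by AUTO with its default length thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def squash(name: str) -> str:
    """Strip whitespace and lowercase, roughly what the custom analyzer emits."""
    return "".join(name.split()).lower()


query = squash("Prabhu Nath Prasad")
names = ["PrabhuNath Prasad", "PrabhuNathPrasad", "Prabhu NathPrasad",
         "Prabhu Nath Prashad", "PrabhuNath Prashad", "PrabhuNathPrashad",
         "Prabhu NathPrashad"]

distances = {n: levenshtein(query, squash(n)) for n in names}
all_match = all(d <= auto_fuzziness(query) for d in distances.values())
print(max(distances.values()), all_match)  # 1 True
```

The largest distance is 1 (the extra "h" in "Prashad"), well inside the 2 edits that AUTO allows for a 16-character term.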
Hope this helps!!
Upvotes: 5
Reputation: 1829
There is no analyzer for that; however, you can look into fuzzy matching.
Specify fuzziness in your query, which can help you retrieve the records above.
I suggest going through the links below:
https://www.elastic.co/blog/found-fuzzy-search
https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzzy-match-query.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzziness.html
This will help you achieve what you want.
Also, there won't be any straightforward way to get the record if the user has typed "PrabhuNath", because Elasticsearch will treat it as a single token. However, you can use a "phrase_prefix" (match_phrase_prefix) query, which helps you fetch records while the user is typing.
To handle basic spelling mistakes, your query will look like this:
{
  "query": {
    "match": {
      "name": {
        "query": "PrabhuNath Prasad",
        "fuzziness": 2
      }
    }
  }
}
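To illustrate the single-token point, here is a small Python sketch. The standard_tokens helper is a hypothetical, regex-based stand-in for the standard analyzer (split on non-word characters, then lowercase), not the real tokenizer. It also uses the fact that the difference in length between two strings is a lower bound on their edit distance.

```python
import re


def standard_tokens(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]


typed = standard_tokens("PrabhuNath Prasad")
stored = standard_tokens("Prabhu Nath Prasad")
print(typed)   # ['prabhunath', 'prasad']
print(stored)  # ['prabhu', 'nath', 'prasad']

# Length difference is a lower bound on edit distance, so a fuzziness of 2
# cannot turn 'prabhunath' into 'prabhu' (or into 'nath').
min_edits = min(abs(len("prabhunath") - len(t)) for t in ("prabhu", "nath"))
print(min_edits)  # 4
```

Under these assumptions, the query above still matches through the "prasad" token (and its near spelling "prashad"), but the "prabhunath" token cannot fuzzily match "prabhu" or "nath" on its own.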
Upvotes: 2