Reputation: 14709
Consider the following results from:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "match": {
      "last_name": "Smith"
    }
  }
}'
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing on the weekends.",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}
Now when I execute the following query:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "fuzzy": {
      "last_name": {
        "value": "Smitt",
        "fuzziness": 1
      }
    }
  }
}'
it returns no results, even though the Levenshtein distance between "Smith" and "Smitt" is 1. The same happens with a value of "Smit". If I set the fuzziness to 2, I do get results. What am I missing here?
Upvotes: 0
Views: 46
Reputation: 8175
I assume that the last_name field you are querying is an analyzed string. The indexed term will therefore be smith and not Smith.
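As a rough illustration of why the stored term is lowercase, here is a minimal Python sketch of what happens to a value at index time. It assumes the field uses the default standard analyzer and only imitates its word splitting and lowercasing; the analyze helper is hypothetical and not part of Elasticsearch.

```python
import re


def analyze(text: str) -> list[str]:
    """Very rough imitation of a lowercasing, word-splitting analyzer.

    Assumption: this only approximates Elasticsearch's standard analyzer,
    which also applies Unicode-aware tokenization rules not modeled here.
    """
    return re.findall(r"\w+", text.lower())


# The value stored in _source keeps its case, but the indexed term does not.
print(analyze("Smith"))
print(analyze("John Smith"))
```

The _source document still shows "Smith", which is why the search results look as if the capitalized form were indexed.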
Returns NO results despite the Levenshtein distance of "Smith" and "Smitt" being 1.
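The fuzzy query does not analyze the search term, so the Levenshtein distance here is actually not 1 but 2: "Smitt" differs from the indexed term "smith" in both the first letter (S vs. s) and the last letter (t vs. h).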
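To check the distances by hand, here is a minimal Python sketch of the classic Levenshtein edit distance. It assumes the indexed term is the lowercase "smith"; the levenshtein function is written for illustration and is not Elasticsearch's own implementation (which, as far as I know, also counts transpositions by default, something that makes no difference for these strings).

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


indexed_term = "smith"  # assumption: the analyzer lowercased "Smith"

print(levenshtein("Smitt", "Smith"))       # distance to the original value
print(levenshtein("Smitt", indexed_term))  # distance to the indexed term
print(levenshtein("Smit", indexed_term))   # same check for "Smit"
```

The comparison is case-sensitive, so the uppercase S costs one edit on its own. That is why a fuzziness of 1 finds nothing while 2 does.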
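Try using this mapping, which indexes last_name without analysis, and your query with a fuzziness of 1 will work. Note that the mapping of an existing field usually cannot be changed in place, so you may need to recreate the index and reindex your documents first: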
PUT /megacorp/employee/_mapping
{
  "employee": {
    "properties": {
      "last_name": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
Hope this helps
Upvotes: 1