Reputation: 179
I have the following field declaration for my Solr index:
<field name="description" type="text_ci" indexed="true" multiValued="false" required="true"/>
Field type:
<fieldType name="text_ci" class="solr.TextField" omitNorms="true" sortMissingLast="true">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
The index contains documents whose description value looks like "Accomodation in {city}" (each document has a different city).
I want fuzzy search to work, so that a misspelled query such as acomodation~2 still returns results, but I find it difficult because "accomodation" is only part of the indexed text.
I am thinking of using NGramFilter to tokenize the input, but I am not sure whether this is the right approach or how to implement it.
What can I do?
Upvotes: 1
Views: 1647
Reputation: 8658
Lucene supports fuzzy searches based on the Levenshtein distance (edit distance) algorithm. To run a fuzzy search, append the tilde symbol, "~", to the end of a single-word term.
I don't see a need for NGramFilter here. The ~ operator is what runs fuzzy searches.
You need to add the ~ operator after every single term, and you can optionally specify the edit distance after it, as below.
{FIELD_NAME}:{TERM_1}~{Edit_Distance}
Your request will look like below.
http://localhost:8983/solr/FuzzySearchExample/select?indent=on&q=desc:Samsu~&wt=json&fl=id,desc
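As a rough sketch of how such a request could be assembled in code, the Python snippet below builds the same kind of URL with proper query-string encoding. The helper name build_fuzzy_url and its parameters are hypothetical; the core name FuzzySearchExample and the field names are taken from this thread, and the host and port are assumed to be a local default install.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_fuzzy_url(base_url, field, term, max_edits=None, fields="id"):
    """Build a Solr select URL for a single-term fuzzy query (hypothetical helper)."""
    # "~" alone lets Solr pick its default edit distance; "~N" sets it explicitly.
    fuzzy = f"{field}:{term}~" + ("" if max_edits is None else str(max_edits))
    params = {"indent": "on", "q": fuzzy, "wt": "json", "fl": fields}
    return f"{base_url}/select?{urlencode(params)}"


# Assumed local Solr core, as in the request shown above.
base = "http://localhost:8983/solr/FuzzySearchExample"
url = build_fuzzy_url(base, "description", "acomodation", 2, "id,description")

# Decode the query string again to check what Solr would receive.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["q"][0])
```

Encoding the parameters this way avoids problems with characters such as ":" and "," in the query string, which is easy to get wrong when the URL is concatenated by hand.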
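I used the field type below. Note that the index analyzer uses StandardTokenizerFactory instead of KeywordTokenizerFactory, so each word of the description is indexed as its own term and a fuzzy query can match it.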
<fieldType name="text_ci" class="solr.TextField" omitNorms="true" sortMissingLast="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
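To illustrate why the tokenizer at index time matters, here is a small Python sketch that compares edit distances against the tokens each analyzer would roughly produce. It uses a plain Levenshtein distance and a whitespace split as stand-ins; Lucene's real fuzzy matching and StandardTokenizerFactory differ in detail, and the document value "Accomodation in Paris" is a made-up example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


query = "acomodation"
value = "Accomodation in Paris"  # hypothetical document value

# KeywordTokenizerFactory + LowerCaseFilterFactory: the whole value is one token.
keyword_tokens = [value.lower()]
# Rough stand-in for StandardTokenizerFactory + LowerCaseFilterFactory.
standard_tokens = value.lower().split()

keyword_distances = {t: levenshtein(query, t) for t in keyword_tokens}
standard_distances = {t: levenshtein(query, t) for t in standard_tokens}

print(keyword_distances)
print(standard_distances)
```

With the keyword-style token, the query is compared against the entire string, so the distance is far above the maximum of 2 that a fuzzy query allows. With word-level tokens, "accomodation" is only one edit away from the misspelled query.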
I get the below response for acomodation~2 or acomodation~1.
And I get the below response for acomodation.
Upvotes: 2