Reputation: 2964
I have installed Solr and the Sunspot gem for my Rails 3.0 app.
My goal is to do fuzzy search. For example, I want the search term "Chatuea Marguxa" to match "Château Margaux".
Currently, only exact words are matched, so fuzzy matching does not work at all.
My model:
    searchable do
      text :winery
    end
My controller:
    search = Wine.search do
      fulltext 'Chatuea Marguxa'
    end
The Solr schema I tried, with n-grams:
    <fieldType name="text" class="solr.TextField" omitNorms="false">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
      </analyzer>
    </fieldType>
I also tried with double metaphone:
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StandardFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
    </analyzer>
In both cases I got 0 results (after reindexing, of course).
What did I do wrong?
Upvotes: 3
Views: 1201
Reputation: 882
Try adding the '~' character after every word in the query, like this: Chatuea~ Marguxa~. This is the fuzzy operator implemented in Lucene: http://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Fuzzy%20Searches
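The suggestion above can be sketched as a small Ruby helper that appends '~' to each term before the string is handed to the search. The name `fuzzify` is hypothetical and is not part of Sunspot or Solr. Whether the operator is honored depends on which query parser handles the request; a parser that does not interpret Lucene operator syntax may treat '~' as plain text, so treat this as a sketch to try rather than a guaranteed fix.

```ruby
# Hypothetical helper (not part of Sunspot): appends the Lucene fuzzy
# operator (~) to every whitespace-separated term of a query string.
def fuzzify(query)
  query.split.map { |term| "#{term}~" }.join(' ')
end

puts fuzzify('Chatuea Marguxa')
# prints "Chatuea~ Marguxa~"
```

The result could then be passed to the search in place of the raw string, for example fulltext fuzzify(params[:q]), assuming the query parser in use accepts Lucene syntax.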
Upvotes: 1
Reputation: 41874
Some searching around turned up the fuzzily gem:
Anecdotical benchmark: against our whole Geonames-derived table of locations (3.2M records, about 1GB of data), on my development machine (a 2011 MacBook Pro)
- searching for the top 10 matching records takes 6ms ±1
- preparing the index for all records takes about 10min
- the DB query overhead when changing a record is at 3ms ±2
- the memory overhead (footprint of the trigrams table index) is about 300MB
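The benchmark mentions a trigrams table. To illustrate the general idea of trigram matching, here is a minimal sketch in plain Ruby. It is not fuzzily's implementation; the method names and the Jaccard scoring are assumptions chosen for illustration, and accents are not folded, so the example uses "Chateau" rather than "Château".

```ruby
require 'set'

# Splits a word into its set of overlapping three-character windows.
def trigrams(word)
  word.downcase.chars.each_cons(3).map(&:join).to_set
end

# Jaccard similarity of two trigram sets: shared trigrams divided by
# all distinct trigrams. Returns 0.0 when either word is too short.
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  return 0.0 if ta.empty? || tb.empty?
  (ta & tb).size.to_f / (ta | tb).size
end

puts trigram_similarity('Chatuea', 'Chateau')  # prints 0.25
puts trigram_similarity('Chatuea', 'Merlot')   # prints 0.0
```

The misspelled word still shares the trigrams "cha" and "hat" with the correct spelling, so it scores above an unrelated word, which is what lets a trigram index rank near misses.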
Upvotes: 0