Reputation: 2198
I am running a Rails application with Sunspot Solr and a table of cities with a column named name. There is a city named 'Emmendingen'. I get results for 'Emmendi', 'Emmendin' and 'Emmendige', but not for the full name 'Emmendingen'.
In the model I search like this:
search(:include => :geo_name_admin_one_code) do
  any do
    fulltext(q, :fields => [:name])
    fulltext(q, :fields => [:alternate_name])
  end
  with(:feature_class, 'P')
  order_by(:population, :desc)
  limit(10)
end
My config looks like this
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="10"/>
    <filter class="solr.ReversedWildcardFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory" />
  </analyzer>
</fieldType>
So how can I match the exact name?
Upvotes: 0
Views: 171
Reputation: 2198
I solved it with this config
<!-- *** This fieldType is used by Sunspot! *** -->
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
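To illustrate why this config probably matches the full name, here is a rough Ruby sketch of front edge n-grams. The helper edge_ngrams is hypothetical and only approximates what solr.EdgeNGramFilterFactory does with side="front"; it is not Solr code. With maxGramSize="15", the 11-character token itself ends up among the indexed prefixes. Under the same simplified model, a name longer than 15 characters would still have no full-length token.

```ruby
# Hypothetical helper: a simplified model of a front edge n-gram filter.
# It returns every prefix of the token whose length is between min and max.
def edge_ngrams(token, min, max)
  upper = [max, token.length].min
  (min..upper).map { |size| token[0, size] }
end

# Lowercased token, as it would look after LowerCaseFilterFactory.
indexed = edge_ngrams('emmendingen', 2, 15)

puts indexed.include?('emmendi')      # partial query term
puts indexed.include?('emmendingen')  # full name
puts indexed.map(&:length).max        # longest indexed prefix
```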
Upvotes: 1
Reputation: 9789
Your indexed tokens cannot be longer than 10 characters. You have trimmed them twice, once with NGrams and once with EdgeNGrams (which looks very wrong).
Your query does no trimming, so your 11-character word does not match anything.
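As a rough illustration of the length cap, the following Ruby sketch models an n-gram filter with minGramSize="2" and maxGramSize="10". The helper ngrams is hypothetical and deliberately ignores the other filters in the chain; it only shows that no indexed token can reach 11 characters.

```ruby
# Hypothetical helper: a simplified model of an n-gram filter.
# It returns every substring of the token whose length is between min and max.
def ngrams(token, min, max)
  (min..max).flat_map do |size|
    (0..token.length - size).map { |start| token[start, size] }
  end.uniq
end

# Lowercased for simplicity; the real chain lowercases after the n-gram step.
grams = ngrams('emmendingen', 2, 10)

puts grams.include?('emmendi')      # 7 characters, within the cap
puts grams.include?('emmendin')     # 8 characters, within the cap
puts grams.include?('emmendingen')  # 11 characters, longer than any gram
puts grams.map(&:length).max
```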
The easiest way to troubleshoot that on your own is the Analysis screen in the Admin UI, where you can enter both index and query strings and see what happens and whether they would match.
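The same comparison can usually be scripted against Solr's field analysis request handler. The sketch below only builds the request URL. The base URL (port 8982 and core name default) is an assumption about a typical Sunspot development setup, the helper field_analysis_uri is hypothetical, and the /analysis/field handler may need to be enabled in solrconfig.xml on older Solr versions.

```ruby
require 'uri'

# Hypothetical helper: builds a URL for Solr's field analysis handler.
# base_url is an assumption; adjust host, port and core to your setup.
def field_analysis_uri(base_url, field_type, index_value, query_value)
  uri = URI.join(base_url, 'analysis/field')
  uri.query = URI.encode_www_form(
    'analysis.fieldtype'  => field_type,   # the fieldType name from schema.xml
    'analysis.fieldvalue' => index_value,  # text run through the index analyzer
    'analysis.query'      => query_value,  # text run through the query analyzer
    'analysis.showmatch'  => 'true',
    'wt'                  => 'json'
  )
  uri
end

uri = field_analysis_uri('http://localhost:8982/solr/default/', 'text',
                         'Emmendingen', 'Emmendingen')
puts uri
```

Fetching that URL (for example with Net::HTTP.get(uri)) against a running Solr instance should return the tokens produced at each analysis stage for both strings.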
Upvotes: 1