Reputation: 123
I am trying to test the spellchecking functionality with Solr 4.7.2 using solr.DirectSolrSpellChecker (where you don't need to build a dedicated index).
I have a field named "title" in my index; I used a copy field definition to create a field named "title_spell" to be queried for the spellcheck (title_spell is correctly filled). However, in the Solr admin console, I always get empty suggestions.
For example: I have a Solr document with the title "A B automobile"; in the admin console I tick the spellcheck checkbox and enter "atuomobile" in the spellcheck.q input field below it. I expect to get at least something like "A B automobile" or "automobile", but the spellcheck suggestion remains empty...
My configuration:
schema.xml (only relevant part copied):
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="de_DE/synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
</fieldType>
...
<field name="title_spell" type="textSpell" indexed="true" stored="true" multiValued="false"/>
solrconfig.xml (only relevant part copied):
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">title_spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <float name="thresholdTokenFrequency">.01</float>
  </lst>
</searchComponent>
...
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">explicit</str>
  </lst>
  <!--Attempt to factor the online date into the weighting...-->
  <lst name="appends">
    <str name="bf">recip(ms(NOW/MONTH,sort_date___d_i_s),3.16e-11,50,1)</str>
    <!--<str name="qf">title___td_i_s_gcopy^1e-11</str>-->
    <str name="qf">title___td_i_s_gcopy^21</str>
    <str name="q.op">AND</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
What did I miss? Thanks for your answers!
Upvotes: 0
Views: 1187
Reputation: 1183
How large is your index? For a small index (think fewer than a few million docs), you're going to have to tune accuracy, maxQueryFrequency, and thresholdTokenFrequency. (Actually, it would probably be worth doing this on larger indices as well.)
For example, my 1.5 million doc index uses the following for these settings:
<float name="maxQueryFrequency">0.01</float>
<float name="thresholdTokenFrequency">.00001</float>
<float name="accuracy">0.5</float>
accuracy tells Solr how similar a candidate must be to the query term before it's considered worth returning as a suggestion.
maxQueryFrequency is the maximum fraction of documents a query term may appear in and still be treated as a possible misspelling; terms that occur more often than that are not corrected.
thresholdTokenFrequency is the minimum fraction of documents a term must appear in before it's considered worth returning as a suggestion.
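To make the three settings concrete, here is a minimal sketch of where they would sit in the spellchecker block from the question, with thresholdTokenFrequency lowered. The values are simply the ones from my own index; treat them as assumptions to tune against your data rather than recommendations.

```xml
<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">title_spell</str>
  <str name="classname">solr.DirectSolrSpellChecker</str>
  <!-- minimum similarity a candidate needs before it is returned -->
  <float name="accuracy">0.5</float>
  <!-- query terms found in more than 1% of documents are not corrected -->
  <float name="maxQueryFrequency">0.01</float>
  <!-- candidates must occur in at least 0.001% of documents (assumed value, tune per index) -->
  <float name="thresholdTokenFrequency">.00001</float>
</lst>
```

With the original .01, a suggestion has to appear in 1% of all documents, which a rare title word such as "automobile" may well not reach.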
If you plan to use spellchecking on multiple phrases, you may need to add a ShingleFilter to your title_spell field.
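A rough sketch of what that could look like is below. The field type name textSpellPhrase and the shingle sizes are hypothetical choices for illustration, not something from the question's schema; title_spell would then have to use this type and be reindexed.

```xml
<!-- Hypothetical field type; name and shingle sizes are assumptions -->
<fieldType name="textSpellPhrase" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits word n-grams such as "a b" and "b automobile" alongside the single terms -->
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

Keeping outputUnigrams="true" means single words like "automobile" are still indexed, so single-term corrections should keep working next to the phrase terms.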
Another thing you might try is setting your queryAnalyzerFieldType to title_spell.
Upvotes: 2
Reputation: 13402
Can you please try editing your requestHandler declaration:
<requestHandler name="/standard" class="solr.SearchHandler" default="true">
and the query URL as:
http://localhost:8080/solr/service/standard?q=<term>&qf=title_spell
First experiment with small terms and learn how it behaves. One problem here is that it will only return terms starting with the query term. You can use FuzzyLookupFactory, which will match and return fuzzy results. For more information, check the Solr Suggester wiki.
Upvotes: 0