Reputation: 95
I am querying for the word RACE against the field subcategory, which is of type solr.TextField, and I am getting results that contain RACE, RACING, and RACED, but I need results that match the word RACE only. Is this the default behavior of Solr, or am I doing something wrong in the configuration? Kindly suggest.
Note: I have not added any stopwords or synonyms to their respective text files.
<field name="subcategory" type="text" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
Upvotes: 0
Views: 432
Reputation: 52792
You have a stemming filter in your chain:
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
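The task of a stemming filter is to reduce words to their common stem, meaning that race, racing, raced, etc. will all be reduced to the same stem (the exact form depends on the stemmer, e.g. race or rac). Because the filter runs at both index and query time, a query for RACE matches every document whose terms share that stem.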
If you do not want stemming to be performed, remove the filter from both your index and query analyzer chains.
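A minimal sketch of that first option, assuming everything else in the field type from the question stays as it is: the same text type with the SnowballPorterFilterFactory line dropped from both analyzers. Documents that were already indexed still contain stemmed terms, so they would need to be reindexed after the change.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stemming filter removed from the index chain -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stemming filter removed from the query chain -->
  </analyzer>
</fieldType>
```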
If you do want stemming, but only for certain queries, create a duplicated field with the analysis you want, then use copyField to index the same content into both fields and query the field without stemming when you don't want stemming to occur.
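The copyField approach could look roughly like the sketch below. The names text_nostem and subcategory_exact are hypothetical, chosen only for illustration, and the unstemmed type is deliberately kept simple (one analyzer shared by index and query, no synonyms); adjust it to whatever analysis you actually want.

```xml
<!-- Hypothetical field type: same tokenizing and lowercasing, but no stemming. -->
<!-- A single analyzer element without a type applies at both index and query time. -->
<fieldType name="text_nostem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Existing stemmed field from the question. -->
<field name="subcategory" type="text" indexed="true" stored="true" multiValued="false"/>

<!-- Hypothetical unstemmed copy; it is only searched, so it does not need to be stored. -->
<field name="subcategory_exact" type="text_nostem" indexed="true" stored="false" multiValued="false"/>

<!-- Index the same source value into both fields. -->
<copyField source="subcategory" dest="subcategory_exact"/>
```

With this in place, and after reindexing, a query such as subcategory_exact:race would match only documents containing the token race, while subcategory:race would keep the stemmed behavior.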
Upvotes: 2