user2508067

Reputation: 95

Solr returning wrong results - not exact match

I am querying for the word RACE against the field subcategory, which is of type solr.TextField, and I am getting results that contain RACE, RACING, and RACED, but I need results matching the word RACE only. Is this the default behavior of Solr, or am I doing something wrong in the configuration? Please advise.

Note: I haven't put any stopwords or synonyms in their respective text files.

 <field name="subcategory" type="text" indexed="true" stored="true" multiValued="false"/>

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Upvotes: 0

Views: 432

Answers (1)

MatsLindh

Reputation: 52792

You have a stemming filter in your chain:

<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>

The task of a stemming filter is to reduce words to their common stem, meaning that race, racing, raced, etc. will all be reduced to the same stem (probably rac or race, depending on the stemmer).

If you do not want stemming to be performed, remove the filter from both your index and query chains.
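As a rough sketch of that first option, here is the "text" field type from the question with only the two SnowballPorterFilterFactory lines taken out. Everything else is copied from the question unchanged. Because the index-time analysis changes, existing documents would need to be reindexed before queries behave consistently.

```xml
<!-- Sketch: same field type as in the question, minus the stemming filter -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- SnowballPorterFilterFactory removed here -->
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            format="wordset"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- SnowballPorterFilterFactory removed here as well -->
  </analyzer>
</fieldType>
```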

If you do want stemming, but only for certain queries, create a second field with the analysis you want, use copyField to index the same content into both fields, and query the field without stemming when you don't want stemming to occur.
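A minimal sketch of the copyField approach, assuming the stemmed "text" type and the subcategory field from the question stay as they are. The names text_unstemmed and subcategory_exact are hypothetical, chosen only for illustration, and the reduced analyzer (whitespace tokenizer plus lowercasing) is an assumption; the stopword and synonym filters from the question could be copied over if they are wanted.

```xml
<!-- Hypothetical unstemmed type: a single analyzer is used for both index and query -->
<fieldType name="text_unstemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Existing stemmed field from the question -->
<field name="subcategory" type="text" indexed="true" stored="true" multiValued="false"/>

<!-- Hypothetical second field holding the same content without stemming -->
<field name="subcategory_exact" type="text_unstemmed" indexed="true" stored="false" multiValued="false"/>

<!-- Copy the raw input of subcategory into the unstemmed field at index time -->
<copyField source="subcategory" dest="subcategory_exact"/>
```

With this in place (and after reindexing), a query such as subcategory_exact:race should match only documents containing the token race, while subcategory:race keeps the stemmed behavior.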

Upvotes: 2
