LearningRoR

Reputation: 27192

Getting fuzzy searching to work for Sunspot?

I have the following two products in my database and Solr index: "Total War: Shogun 2 [Download]" and "Eggs".

I want the search to match these two products even when the query contains mistakes, e.g.:

"Egggs", "Eggz", "Eg", "Egs" and "Shogn Download", "Totle War", "Tutal War: Shogunn 2 Download" etc.

EDIT (working somewhat):

This will get you started, but I'm still having issues with searches that contain other characters, i.e. only names like "Eggs" and "Great Value Vitamin D Whole Milk" can be misspelled, not "Total War: Shogun 2".

New code:

<fieldType name="text" class="solr.TextField" omitNorms="false">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="1" splitOnNumerics="1" splitOnCaseChange="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1"/>
            <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="1" splitOnNumerics="1" splitOnCaseChange="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1"/>
            <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>

Ideally, my search would work like Google's, which does a pretty good job of correcting your spelling regardless of case and despite a couple of errors. How would I make my search behave similarly?

Upvotes: 0

Views: 865

Answers (1)

Jayendra

Reputation: 52769

Fuzzy searches do not undergo query-time analysis.
So there is a chance that your query does not match the indexed terms.

The terms in the above config undergo lowercase filtering during indexing, so all terms are stored in lowercase.
Searching for Egggs would therefore never produce any results, because Egggs would not match eggs. The searched terms need to be lowercased explicitly.

Also, in the above config, the index-time analysis is very different from the query-time analysis.
It's usually recommended to have similar filters at query time and index time, so that the indexed terms match the searched terms.
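
As a rough illustration of keeping index-time and query-time analysis aligned, the field type below uses a single analyzer for both. This is only a sketch: the name text_fuzzy is hypothetical, and leaving out the phonetic and edge n-gram filters is an assumption made here so that the indexed terms stay plain lowercased words. It is not taken from the original config.

```xml
<!-- Hypothetical field type; the name "text_fuzzy" is an assumption for illustration -->
<fieldType name="text_fuzzy" class="solr.TextField" omitNorms="false">
    <!-- An analyzer without a type attribute is applied at both index and query time -->
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

With a field like this, the indexed term for Eggs is simply eggs, so a fuzzy query on an already lowercased term (for example eggz~) is compared against the actual word rather than against a phonetic code or an n-gram.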

solr.PorterStemFilterFactory may result in a completely different root for the searched term, which may then never match the indexed terms.

Revisit your configuration. Maybe check the example Solr schema.xml for reference.

Upvotes: 2
