NeatNerd

Reputation: 2373

Solr Query Syntax exact match

I have a field type configured like this:

    <fieldType name="gtext" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <!-- Needed for efficient trailing wildcard queries -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
         maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2" maxFractionAsterisk="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="1"
            stemEnglishPossessive="1"               
            catenateAll="0"
            preserveOriginal="1"
            />
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1"
                generateNumberParts="1"
                catenateWords="0"
                catenateNumbers="1"
                stemEnglishPossessive="1"               
                catenateAll="0"
                preserveOriginal="1"
                />
    </analyzer>
    </fieldType>

When I search for fun, for example, the results also include funny. How can I avoid this so that only fun is matched? Is it caused by the reversed wildcard filter?

Upvotes: 0

Views: 1105

Answers (1)

Jayendra

Reputation: 52769

This is caused by the EdgeNGramFilterFactory filter in your index analyzer:

    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>

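EdgeNGramFilterFactory generates edge n-grams (prefixes) of each token at index time. With minGramSize="1" and side="front", for example:

    funny -> f, fu, fun, funn, funny

The terms indexed for funny therefore include fun, so a search for fun matches documents that contain funny.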
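If prefix matching is not needed on this field, one option is to analyze it without the edge n-gram filter. The following is only a minimal sketch under that assumption: the name gtext_exact is hypothetical, and the filter chain is deliberately reduced to the tokenizer and lowercasing, so adjust it to your needs.

```xml
<!-- Sketch only: "gtext_exact" is a hypothetical name, not part of the original schema. -->
<fieldType name="gtext_exact" class="solr.TextField" positionIncrementGap="100">
  <!-- A single <analyzer> without a type is used at both index and query time. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- No EdgeNGramFilterFactory here, so "funny" is indexed only as "funny". -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Existing documents would have to be reindexed after changing the analysis chain. If you still need prefix matching elsewhere, you could keep gtext and copy the content into a second field of the exact-match type with copyField.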

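ReversedWildcardFilterFactory does not cause this issue. It only speeds up queries with a leading wildcard.

For example, funny is also indexed in reversed form as ynnuf (alongside the original token, since withOriginal="true").

A leading-wildcard query such as *nny can then be rewritten as the prefix query ynn* against the reversed terms, which performs much better.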

Upvotes: 2
