user2260040
Reputation: 1380

SOLR wildcard search not returning results

I have a schema definition as follows:

<fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <filter class="solr.PatternReplaceFilterFactory" pattern="([,]+)" replacement=" " replace="all"/>      
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>          
    </analyzer>
    <analyzer type="query">
      <filter class="solr.PatternReplaceFilterFactory" pattern="([,]+)" replacement=" " replace="all"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>  

And some data in the format:

17,WALKINGTON,AVENUE,,MARGARET RIVER,WA

If I search for 17 walkington, it shows the above in the results. How can I make sure that if I search for 17 walk, the above shows up in the search results? I have tried appending * at the end of the search query, but can't get it to work. Any suggestions?

Upvotes: 1

Views: 73

Answers (1)

Abhijit Bashetti
Reputation: 8678

To match partial words, add an n-gram filter to the index analyzer of your field type, so that fragments such as walk are indexed alongside the full token walkington.

Factory class: solr.NGramFilterFactory

Its arguments:

minGramSize: (integer, default 1) The minimum n-gram size, must be > 0.

maxGramSize: (integer, default 2) The maximum n-gram size, must be >= minGramSize.
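
To make the two size arguments concrete, here is a rough Python sketch of what an n-gram filter emits for a single token. The helper `ngrams` is hypothetical and only mimics the idea; it is not Solr's implementation, and the order in which Solr emits grams may differ.

```python
def ngrams(token, min_gram=1, max_gram=2):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break  # longer grams would run past the end of the token
            grams.append(token[start:end])
    return grams


# With the default sizes (1 and 2) only very short fragments are produced.
print(ngrams("walk"))  # ['w', 'wa', 'a', 'al', 'l', 'lk', 'k']

# "walk" becomes an indexed fragment of "walkington" only once
# maxGramSize is at least 4.
print("walk" in ngrams("walkington"))         # False
print("walk" in ngrams("walkington", 1, 10))  # True
```

This is why the defaults are probably too small for a query like 17 walk: maxGramSize has to be at least as long as the longest partial word you expect users to type.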

For example, you can define the field type for your field like this:

<fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <filter class="solr.PatternReplaceFilterFactory" pattern="([,]+)" replacement=" " replace="all"/>      
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="10"/>
      <filter class="solr.LowerCaseFilterFactory"/>          
    </analyzer>
    <analyzer type="query">
      <filter class="solr.PatternReplaceFilterFactory" pattern="([,]+)" replacement=" " replace="all"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>  

Note: the n-gram filter produces a large number of tokens per term, so the index can grow considerably if you have a large data set.
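
To get a feel for that growth, the sketch below counts the n-grams for the sample row from the question with minGramSize="1" and maxGramSize="10". The token list is an assumption about how the tokenizer splits that row, and `ngram_count` is a hypothetical helper, not part of Solr.

```python
def ngram_count(token, min_gram=1, max_gram=10):
    """Count the substrings of token whose length is in [min_gram, max_gram]."""
    longest = min(max_gram, len(token))
    return sum(len(token) - size + 1 for size in range(min_gram, longest + 1))


# Assumed tokens for: 17,WALKINGTON,AVENUE,,MARGARET RIVER,WA
tokens = ["17", "walkington", "avenue", "margaret", "river", "wa"]

print(ngram_count("walkington"))  # 55

total = sum(ngram_count(t) for t in tokens)
print(len(tokens), "tokens become", total, "n-grams")  # 6 tokens become 133 n-grams
```

The count grows roughly with the square of the token length, so lowering maxGramSize or raising minGramSize is the simplest way to keep the index smaller.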

Upvotes: 1
