Boyan

Reputation: 175

Solr autocomplete keywords from multiple fields

I'm new to Solr and would like to implement an autocomplete feature based on two fields, title and description. In addition, the result set should be further restricted by other fields such as id and category. Sample data:

Title: The brown fox lives in the woods
Description: The fox is found in the woods where brown leaves cover the ground. The animal's fur is brown in color and has a long tail.

Desired autocomplete result:

brown fox
brown leaves
brown color

Here are the relevant entries from schema.xml:

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
 <analyzer type="index">
   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
 </analyzer>
 <analyzer type="query">
   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>


<field name="id" type="int" indexed="true" stored="true"/>
<field name="category" type="string" indexed="true" stored="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true"/>

<field name="ac-terms" type="autocomplete" indexed="true" stored="false" multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
<copyField source="title" dest="ac-terms"/> 
<copyField source="description" dest="ac-terms"/>

Query request:

http://localhost:9090/solr/select?q=(ac-terms:brown)
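
For context, a rough simulation of what the index-time chain above emits may help show why this setup matches prefixes of single words rather than returning phrases such as "brown fox". The helper names `edge_ngrams` and `index_terms` are hypothetical, and the code only approximates the whitespace tokenizer, lowercase filter, and edge n-gram filter; it is not Solr's implementation.

```python
def edge_ngrams(token, min_size=2, max_size=25):
    """Return the leading substrings of a token, from min_size up to max_size characters."""
    upper = min(len(token), max_size)
    return [token[:n] for n in range(min_size, upper + 1)]


def index_terms(text):
    """Approximate the index analyzer: split on whitespace, lowercase, emit edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token))
    return terms


print(edge_ngrams("brown"))  # ['br', 'bro', 'brow', 'brown']
print(index_terms("The brown fox"))
```

Every indexed term is a prefix of one word, so a query for `brown` finds matching documents but the index holds no two-word terms to offer as suggestions.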

Upvotes: 2

Views: 1835

Answers (2)

Boyan

Reputation: 175

Solved using ShingleFilterFactory with the following configuration:

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="ac-terms" type="autocomplete" indexed="true" stored="false" multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
<copyField source="title" dest="ac-terms"/>
<copyField source="description" dest="ac-terms"/>
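
To illustrate why this chain produces two-word suggestions, here is a small Python approximation applied to the sample data from the question. Several things are assumptions: the stopword set is a hypothetical stand-in for `stopwords.txt`, the regular expression only roughly mimics the standard tokenizer, and removed stopwords are assumed to leave no position gap (so "brown in color" collapses to "brown color"). On Solr versions where the stop filter preserves position gaps, the shingle filter may emit filler tokens instead.

```python
import re

# Hypothetical stopword list; the real one is whatever stopwords.txt contains.
STOPWORDS = {"a", "and", "in", "is", "the"}


def shingles(text, size=2):
    """Tokenize, lowercase, drop stopwords, then join adjacent tokens into word pairs."""
    tokens = [t for t in re.findall(r"[a-z0-9']+", text.lower()) if t not in STOPWORDS]
    return {" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def suggest(values, prefix):
    """Collect shingles per field value (as copyField would) and keep those with the prefix."""
    terms = set()
    for value in values:
        terms |= shingles(value)
    return sorted(t for t in terms if t.startswith(prefix.lower()))


doc = [
    "The brown fox lives in the woods",
    "The fox is found in the woods where brown leaves cover the ground. "
    "The animal's fur is brown in color and has a long tail.",
]
print(suggest(doc, "brown"))  # ['brown color', 'brown fox', 'brown leaves']
```

Because `outputUnigrams="false"` drops single words, only the pairs remain as indexed terms, and the prefix filter plays the role that `facet.prefix` plays in the request.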

Query request:

http://localhost:9090/solr/select?q=&facet=true&facet.field=ac-terms&facet.prefix=brown 

Result:

brown color
brown fox
brown leaves
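
The question also asked for restricting suggestions by other fields such as category. One possible way, sketched below, is to add a filter query (`fq`) to the facet request. The function name `build_autocomplete_url`, the category value `animals`, and the choice of `q=*:*`, `rows=0`, and `facet.mincount=1` are assumptions for illustration and are not part of the configuration above; the prefix is lowercased because the indexed shingles are lowercased.

```python
from urllib.parse import urlencode

# Base URL taken from the request above.
SOLR_SELECT = "http://localhost:9090/solr/select"


def build_autocomplete_url(prefix, category=None, base=SOLR_SELECT):
    """Build a facet-prefix request, optionally restricted to one category."""
    params = [
        ("q", "*:*"),              # match all documents (assumption)
        ("rows", "0"),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", "ac-terms"),
        ("facet.prefix", prefix.lower()),
        ("facet.mincount", "1"),   # hide terms with no matching documents
    ]
    if category is not None:
        # Quote the value so spaces and special characters are treated literally.
        escaped = category.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", 'category:"%s"' % escaped))
    return base + "?" + urlencode(params)


print(build_autocomplete_url("Brown", category="animals"))
```

With `facet.mincount=1`, only terms that occur in documents passing the filter should be returned.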

Hope this helps someone.

Upvotes: 4

Jesvin Jose

Reputation: 23078

What about creating a field named spellcheck_text and using the copyField feature so that title and description are automatically copied into spellcheck_text?

...instruct Solr that you want it to duplicate any data it sees in the "source" field of documents that are added to the index, in the "dest" field of that document. ... The original text is sent from the "source" field to the "dest" field, before any configured analyzers for the originating or destination field are invoked.

http://wiki.apache.org/solr/SchemaXml#Copy_Fields

Upvotes: 0
