Reputation: 2154
I have a set of terms that I want to map to a specific phrase at query time. For that I am using solr.SynonymFilterFactory. Here is a snippet from schema.xml:
<fieldType name="text_lc" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
</analyzer>
</fieldType>
Here is the synonyms.txt
cat, bat, mouse => small animals
Here is the output of analysis:
The problem is that "small" and "animals" appear as separate tokens, whereas I want to search on "small animals" as a whole.
How can I get a multi-word synonym treated as a single entity in Solr?
Upvotes: 2
Views: 892
Reputation: 52862
The newer SynonymGraphFilter is designed to handle multi-word synonyms, which the old synonym filter did not handle properly.
Multi-word synonyms are still hard to get right, but the graph filter at least has a strategy for them.
Example from the reference guide:
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="mysynonyms.txt"/>
<filter class="solr.FlattenGraphFilterFactory"/> <!-- required on index analyzers after graph filters -->
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="mysynonyms.txt"/>
</analyzer>
Pay attention to the FlattenGraphFilterFactory requirement: it is needed on index analyzers after graph filters, as the comment in the example notes.
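Applied to the field type from the question, a minimal sketch could look like the following. It is untested, and it makes two assumptions that are not in the question: that the synonym mapping is only needed at query time (so the index analyzer carries no synonym filter and needs no FlattenGraphFilterFactory), and that the filter's tokenizerFactory attribute is set to the keyword tokenizer so the entries in synonyms.txt are not split on whitespace when the file is parsed.

```xml
<!-- Sketch only: adapts the text_lc field type from the question; verify on the Analysis screen -->
<fieldType name="text_lc" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- tokenizerFactory controls how the entries in synonyms.txt are parsed;
         with the keyword tokenizer, "small animals" should stay a single token -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

With this setup the indexed values have to contain the literal phrase "small animals" for a query on cat, bat, or mouse to match, since nothing is rewritten at index time.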
Upvotes: 2