Reputation: 53563
I've looked through a ton of examples and other questions here and from them, I've got my config very close to what I need but I'm missing one last little bit that I'm having a heck of a time working out. I'm searching on values like:
solar powered
solar glass
solar globe
solar lights
solar magic
solid brass
solid copper
What I want:
sol: the result should include all these values. This works.
solar: I should get just the first five. This works.
solar gl: I should get only solar glass and solar globe. This does not work. Instead, I get one set of matches for solar and a second set of matches for gl.
In a nutshell, I want to consider the input string as a whole, regardless of any whitespace. I gather this is accomplished by creating a separate query (versus index) analyzer, but I've not been able to make it work. Can anyone suggest a configuration that will get me what I'm looking for?
I've (unsuccessfully) tried:
"solar gl"
mm=100%
Here's my current schema:
<field name="suggest_phrase" type="suggest_phrase"
indexed="true" stored="false" multiValued="false" />
And the field definition:
<fieldType name="suggest_phrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
And the config:
<searchComponent name="suggest_phrase" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest_phrase</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
    <str name="field">suggest_phrase</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest_phrase">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest_phrase</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">false</str>
  </lst>
  <arr name="components">
    <str>suggest_phrase</str>
  </arr>
</requestHandler>
Upvotes: 17
Views: 17211
Reputation: 14830
You may use the AnalyzingInfixLookupFactory or the FreeTextLookupFactory. You will find more details and other suggester algorithms here: http://alexbenedetti.blogspot.de/2015/07/solr-you-complete-me.html
Solr Configuration
<lst name="suggester">
  <str name="name">AnalyzingInfixSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="weightField">price</str>
  <str name="suggestAnalyzerFieldType">text_en</str>
</lst>
<lst name="suggester">
  <str name="name">FreeTextSuggester</str>
  <str name="lookupImpl">FreeTextLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="ngrams">3</str>
  <str name="separator"> </str>
  <str name="suggestFreeTextAnalyzerFieldType">text_general</str>
</lst>
Upvotes: 2
Reputation: 53563
Found the answer, finally! I knew I was really close. It turns out my configuration above was correct and I simply needed to change my query:
Use KeywordTokenizerFactory so that the strings get indexed as a whole.
Use SpellCheckComponent for the request handler.
Query not with q=<string> but with spellcheck.q=<string>.
Given the source strings noted above and a query of spellcheck.q=solar+gl, this yields the desired results:
solar glass
solar globe
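For anyone calling the handler from client code, here is a minimal sketch of building such a request in Python. The host, port, and core name (`localhost:8983`, `mycore`) and the helper name `suggest_url` are assumptions for illustration only; the handler path and the `spellcheck.q` parameter come from the configuration above.

```python
from urllib.parse import urlencode

# Assumed base URL: host, port, and core name are placeholders, not from the post.
SOLR_BASE = "http://localhost:8983/solr/mycore"


def suggest_url(prefix: str, count: int = 10) -> str:
    """Build a request URL for the /suggest_phrase handler (hypothetical helper).

    The prefix is sent as spellcheck.q rather than q, so the whole input
    string, spaces included, is passed as a single value.
    """
    params = {
        "spellcheck.q": prefix,      # the full typed string, e.g. "solar gl"
        "spellcheck.count": count,   # same default as in the handler config
        "wt": "json",
    }
    # urlencode encodes the space as "+", giving spellcheck.q=solar+gl
    return f"{SOLR_BASE}/suggest_phrase?{urlencode(params)}"


print(suggest_url("solar gl"))
```

The point of the sketch is only that the typed prefix goes into `spellcheck.q` and that no plain `q` parameter is sent.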
Upvotes: 17
Reputation: 4284
I've tried this many times and came to the conclusion that it is not possible out of the box. I found a workaround:
I indexed the data with special characters inserted between the words so that they would not be tokenized. For example:
solarzzzzzzpowered
solarzzzzzzglass
solarzzzzzzglobe
Then, when you compose your query, make sure you insert the same filler characters between the words you type. For example, solar gl becomes solarzzzzzzgl.
This will achieve the behaviour you are asking for.
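The substitution itself can be sketched in a few lines of Python. The function name `join_words` is made up for illustration, and the filler `zzzzzz` is simply the one from the example above; the same function would have to be applied both to the values before indexing and to the user's input before querying.

```python
# Filler placed between words; it must be identical at index time and query time.
SEPARATOR = "zzzzzz"


def join_words(text: str) -> str:
    """Replace each run of whitespace between words with the filler string.

    Leading and trailing whitespace is dropped, so "solar " becomes "solar".
    """
    return SEPARATOR.join(text.split())


print(join_words("solar glass"))  # value to index
print(join_words("solar gl"))     # value to query with
```

Suggestions returned by Solr would then need the filler replaced by a space again before being shown to the user.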
Another option would be to skip the autosuggestion component and build a custom field yourself, but then you would have to manage the wildcard search and all the indexing on your own, which is not very convenient in terms of development time or performance.
Upvotes: 0