Reputation: 3689
I've got an Apache Solr web application. I save every query that users enter in my database, and I index each query string together with its search count into a Suggestion core.
Here is the format:
<doc>
<str name="id">superman</str>
<long name="searchCount_l">10</long> <!-- "superman" has been queried 10 times -->
</doc>
<doc>
<str name="id">superman movie</str>
<long name="searchCount_l">30</long> <!-- "superman movie" has been queried 30 times -->
</doc>
Configuration:
<searchComponent name="suggest" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
<str name="buildOnCommit">true</str>
<str name="field">id</str>
</lst>
</searchComponent>
If the user types 'sup', I want "superman movie" to be the first entry in the autosuggest list, because it has the higher search count.
I've looked at implementing a comparatorClass:
public class MySuggestionComparator implements Comparator<SuggestWord>
but the SuggestWord class only stores freq, score, and the string value, not the value of the custom searchCount_l field.
Questions:
Should I implement a custom search handler that queries the Suggestion core and boosts on the searchCount_l field? Is this a good approach for autosuggest? Would a custom search request handler be slower than the built-in suggest component?
Is there a configuration option for solr.SpellCheckComponent that I can use to achieve this?
Which filters does solr.SpellCheckComponent currently use?
Upvotes: 0
Views: 836
Reputation: 52799
You can consider the following alternatives:
Use a normal search with an edge n-gram filter to generate prefix tokens.
Since you already maintain the count, you can search on the prefix and sort on the count.
The index will grow, because every query has to be stored, but lookups should still be fast.
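As a rough sketch of this first alternative, the schema fragment below defines an edge n-gram field type and copies the query string into it. The names text_suggest and suggest_text and the gram sizes are assumptions chosen for illustration; only id and searchCount_l come from the documents in the question.

```xml
<!-- Hypothetical field type: the names and gram sizes are illustrative assumptions. -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole query string as a single token so prefixes span words. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emit prefixes "s", "su", "sup", ... up to 25 characters. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-grams at query time: the typed prefix is matched as-is. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field that receives a copy of the query string stored in id. -->
<field name="suggest_text" type="text_suggest" indexed="true" stored="false"/>
<copyField source="id" dest="suggest_text"/>
```

With a field like this, a request such as q=suggest_text:sup&sort=searchCount_l desc&fl=id,searchCount_l&rows=10 should return "superman movie" (30) ahead of "superman" (10). Input containing spaces would likely need to be quoted or escaped so the query parser does not split it, and prefixes longer than maxGramSize would not match.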
Otherwise, index each search term as a separate document field and do not store the queries.
You can then use the facet component with a facet.prefix query to retrieve the search suggestions.
The counting takes care of itself, because facet values are sorted by count by default.
This should also be fast, and the index size stays limited.
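One way to read this second alternative is that each search is indexed as its own document, with the query text in an untokenized string field, so that the facet counts do the tallying. The handler below is a sketch under that reading; the handler name /suggest_facet and the field query_s are hypothetical and do not appear in the question.

```xml
<!-- Hypothetical handler: the name and the query_s field are illustrative assumptions. -->
<requestHandler name="/suggest_facet" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Match everything and return no documents; only the facet counts are needed. -->
    <str name="q">*:*</str>
    <str name="rows">0</str>
    <str name="facet">true</str>
    <!-- Assumed string field holding the raw query text, indexed but not stored. -->
    <str name="facet.field">query_s</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">10</str>
    <!-- Sort by count, most frequent first (stated explicitly for clarity). -->
    <str name="facet.sort">count</str>
  </lst>
</requestHandler>
```

A request such as /suggest_facet?facet.prefix=sup would then list the indexed terms that start with "sup", most frequent first. facet.prefix is compared against the raw indexed terms, so it is case-sensitive; lowercasing queries before indexing is one way to handle that.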
Upvotes: 1