Reputation: 21002
I want to be able to return useful records if a user searches for a keyword that is very, very common in a Solr index, for example "education".
In this case, close to 99% of the records would contain that word, so searches for this word or similar ones take a long time.
This is for Solr on ColdFusion, but I'm open to solutions that are isolated to just Solr.
Right now I'm thinking of coming up with a list of stopwords and preventing those searches from taking place altogether.
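A minimal sketch of that stopword idea, written in Python for illustration rather than ColdFusion. The COMMON_TERMS set and the strip_common_terms helper are hypothetical names; a real list would presumably be derived from the index's own term statistics.

```python
from typing import List, Optional

# Hypothetical list of overly common terms; the entries are made up.
COMMON_TERMS = {"education", "school"}

def strip_common_terms(query: str) -> Optional[List[str]]:
    """Return the terms worth searching, or None if only common terms remain."""
    terms = [t for t in query.lower().split() if t not in COMMON_TERMS]
    # An empty list means the query consisted solely of common terms.
    return terms or None

print(strip_common_terms("education"))                # None
print(strip_common_terms("serendipitous education"))  # ['serendipitous']
```

A None result is the cue to skip the search and tell the user the query is too broad.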
Upvotes: 3
Views: 461
Reputation: 28718
If the user searches on just one term that is exceedingly common then you need to limit your results and advise the user that there were too many matches.
In the more general case, you want an approach with at least two passes. Take your search terms and look up how common each one is. Then filter on the least common terms first and the most common terms last.
For example, a user searches for "serendipitous education". You identify that you have 11 matches for "serendipitous" and 900000 matches for "education". Thus you apply the "serendipitous" filter first, resulting in 11 matches. Then apply the "education" filter, resulting in 7 matches.
The key to fast searching is indexing and precomputed statistics. If you have statistics like this on hand, you can dynamically create an optimised approach.
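The rarest-first ordering described above can be sketched as follows. This is a toy in-memory illustration in Python, not how Solr is implemented; the postings dictionary, the document IDs, and the search_rarest_first function are all assumptions made up for the example.

```python
from typing import Dict, List, Set

def search_rarest_first(terms: List[str], postings: Dict[str, Set[int]]) -> Set[int]:
    """Intersect posting sets, starting from the least common term."""
    if not terms:
        return set()
    # Precomputed statistic: how many documents contain each term.
    ordered = sorted(terms, key=lambda t: len(postings.get(t, ())))
    result = set(postings.get(ordered[0], ()))
    for term in ordered[1:]:
        if not result:
            break  # no candidates left, so skip the more common terms
        result &= postings.get(term, set())
    return result

# Toy index: term -> set of document IDs (made-up data).
postings = {
    "serendipitous": {3, 8, 15, 42},
    "education": set(range(1, 1001)) - {8},
}
matches = search_rarest_first(["education", "serendipitous"], postings)
print(sorted(matches))  # [3, 15, 42]
```

Starting from the smallest set means each later intersection only has to check a handful of candidates against the large "education" set.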
Upvotes: 0
Reputation: 12485
If searches are taking a long time, it could be because you are not limiting the number of results that are returned. The <cfsearch> tag has a maxrows attribute, as well as a startrow attribute, that you could use to limit or paginate the data. Alternatively, you could call Solr's web service directly through a <cfhttp> call:
<cfhttp url="http://localhost:8983/solr/<collection_name>/select/?q=<searchterm>&fl=*,score&rows=100&wt=json" />
Solr will return 10 rows by default; you can change this with the rows parameter. You can use the start parameter as well (note that Solr starts counting with 0 instead of 1). I believe this solution is more flexible, especially if you're using CF 9, as it allows you to paginate while sorting on a field other than score.
You can find more detail here: http://www.thefaberfamily.org/search-smith/coldfusion-solr-tutorial/
Upvotes: 2