Reputation: 55
I've looked around for a solution to no avail, but I'd imagine there's a way to do this.
We've got a Solr implementation with 30 or so fields, each with an associated boost value. Some fields are weighted equally; most have differing values.
We'd like to boost a document score if multiple terms are hit within a given field vs. across equally weighted fields.
Example: Searching for Computer Programming
If Computer Programming appears in the same field of a document, I'd like that to score higher than if 'Computer' appears in one field and 'Programming' appears in another. Our current configuration scores them equally (assuming the fields are weighted equally).
I think this may involve phrase slop and proximity, but I'm hoping there's another way to manage this?
Upvotes: 4
Views: 4765
Reputation: 303
This can be accomplished by using a Boost Query (bq) with a regex query. For example, in my application I boost matches where exactName or exactSynonym starts with the query string by adding the bq parameter:
bq:(exactname:/<your_lucene_escaped_query_string_here>.*/) OR (exactSynonyms:/<your_lucene_escaped_query_string_here>.*/)
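This is how I escape Lucene special characters:
escapeLucene: function (value, addQuotes) {
    // Accept an array of string fragments and join them into one string.
    if (Array.isArray(value)) {
        value = value.join("");
    }
    // Characters treated as special by the Lucene query parser (plus space).
    var specials = ['+', '-', '&', '!', '(', ')', '{', '}', '[', ']', '^', '"', '~', '*', '?', ' ', ':', ';', '\\', '/', '|'];
    // Build one alternation in which every special character is backslash-escaped.
    var regexp = new RegExp("(\\" + specials.join("|\\") + ")", "g");
    // Prefix each special character found in the input with a backslash.
    var escapedVal = value.replace(regexp, "\\$1");
    // Optionally wrap the result in double quotes, but only if something was escaped.
    if (escapedVal.indexOf('\\') > -1 && addQuotes) {
        escapedVal = "\"" + escapedVal + "\"";
    }
    return escapedVal;
}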
Upvotes: 1
Reputation: 1114
This is a good use case for the dismax/edismax query parser.
I recommend first using the qf parameter to set up fields and boosts. Then you can start playing with pf and ps to boost phrase matches within a certain slop. If you are feeling more adventurous (and you need it), you can use shingles as well.
For reference:
https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html
https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html
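As a rough illustration of the qf/pf/ps suggestion, here is a minimal sketch that assembles the request parameters in JavaScript. The field names (title, description, keywords), the boost values, the slop of 5 and the collection path are all made up for the example; only defType, q, qf, pf and ps are actual edismax parameters.

```javascript
// Sketch only: field names and boost values below are hypothetical.
function buildEdismaxParams(userQuery) {
  const params = new URLSearchParams();
  params.set("defType", "edismax");
  params.set("q", userQuery);
  // qf: fields searched term by term, each with its own boost.
  params.set("qf", "title^2 description^2 keywords^1");
  // pf: fields in which the whole query is additionally tried as a phrase,
  // so a document with all terms together in one field gets an extra boost.
  params.set("pf", "title^10 description^10");
  // ps: slop allowed for the pf phrase match; larger values reward
  // looser co-occurrence of the terms within the same field.
  params.set("ps", "5");
  return params;
}

const params = buildEdismaxParams("Computer Programming");
// Hypothetical collection path; adjust to your own Solr core.
console.log("/solr/mycollection/select?" + params.toString());
```

With equal qf weights, the pf boost is what separates "both terms in one field" from "one term in each field", so the pf boosts and ps are the values to tune.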
Upvotes: 2
Reputation: 2264
We can boost a document's score if the search string appears in a particular field.
Example: a document has, say, 10 fields, one of which is title. Let's say we want to boost the document's score if the search string "Searching for Computer Programming" appears in the title field. In the query you need to pass q=<searchstring> OR <field to boost>:(<searchstring>)^<boost factor>
Example:
http://Solrserver:solrport/solr/mycollection/select?q=(Searching for Computer Programming) OR (title:(Searching for Computer Programming)^5)&wt=json&indent=true&debugQuery=true
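The example URL above contains raw spaces and parentheses, so in practice it needs URL encoding. A minimal sketch of building the same kind of request in JavaScript follows; the function names and the base URL are hypothetical, and the search string is assumed to be already escaped for the query parser.

```javascript
// Builds "(<terms>) OR (<field>:(<terms>)^<boost>)" following the pattern above.
// Assumes searchString has already been escaped for the Lucene query parser.
function buildBoostedQuery(searchString, field, boost) {
  return "(" + searchString + ") OR (" + field + ":(" + searchString + ")^" + boost + ")";
}

// Appends the encoded query and the same extra parameters as the example URL.
function buildSelectUrl(baseUrl, q) {
  const params = new URLSearchParams({
    q: q,
    wt: "json",
    indent: "true",
    debugQuery: "true"
  });
  return baseUrl + "/select?" + params.toString();
}

const q = buildBoostedQuery("Searching for Computer Programming", "title", 5);
// Hypothetical host, port and collection name.
console.log(buildSelectUrl("http://localhost:8983/solr/mycollection", q));
```

Leaving debugQuery=true on while tuning is useful because the response then explains how much each clause contributed to the score.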
About proximity search: when you search for "Searching for Computer Programming" (in double quotes) instead of Searching for Computer Programming, it is called a phrase search, and Solr looks for an exact match of the quoted phrase. A proximity search is a phrase search with a slop value: Solr matches documents in which the search terms occur within the given distance (in term positions) of each other.
Example:
Normal search: Searching for Computer Programming
Phrase search: "Searching for Computer Programming"
Proximity search : "Searching for Computer Programming"~10
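The three forms above differ only in quoting and the ~N suffix, which the following small sketch makes explicit. The helper names are hypothetical; the only escaping done here is for double quotes and backslashes inside the phrase.

```javascript
// Wraps text in double quotes so it is treated as a phrase.
// Embedded double quotes and backslashes are backslash-escaped.
function phraseQuery(text) {
  return '"' + text.replace(/(["\\])/g, "\\$1") + '"';
}

// Adds a slop suffix to a phrase, turning it into a proximity search.
function proximityQuery(text, slop) {
  if (!Number.isInteger(slop) || slop < 0) {
    throw new RangeError("slop must be a non-negative integer");
  }
  return phraseQuery(text) + "~" + slop;
}

const text = "Searching for Computer Programming";
console.log(text);                     // normal search
console.log(phraseQuery(text));        // phrase search
console.log(proximityQuery(text, 10)); // proximity search
```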
Upvotes: 1