Reputation: 143
I have a Widget model that's indexed into Elasticsearch using searchkick:
searchkick word: [:title], highlight: [:title], term_vector: true
And I've indexed these documents:
{ title: "work with puppies" }
{ title: "work with sharks" }
{ title: "work with kittens" }
{ title: "shoot lasers at the moon" }
I'm trying to do a "more like this" (MLT) query for a new bit of text:
"work with lasers"
My goal is to have it hit that last document with the highest score because 'lasers' is more specialized than 'work with', which is common in my document corpus.
I've tried this:
Widget.search query: {
  mlt: {
    like_text: "work with lasers",
    min_term_freq: 1,
    boost_terms: 5,
    analyzer: 'searchkick_search2'
  }
}
But it gives me back the "work with..." documents at the top with highest scores.
I've also tried adding stopwords: ['work', 'with'] to the mlt options, but then I get 0 results.
Is there a way I can get searchkick/elasticsearch to give me back documents that have specialized terms with the highest scores and downplay documents that only match commonly-seen terms?
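The intuition here (that 'lasers' should count for more than 'work with') is essentially inverse document frequency. Below is a minimal pure-Ruby sketch of that idea on the four titles above. The smoothed formula log(1 + N/df) and the helper names are assumptions chosen for illustration; this is not the scoring Elasticsearch actually uses.

```ruby
# Toy corpus from the question (titles only).
DOCS = [
  "work with puppies",
  "work with sharks",
  "work with kittens",
  "shoot lasers at the moon"
].freeze

# Count how many documents contain each term (whitespace tokenization only).
def document_frequencies(docs)
  freqs = Hash.new(0)
  docs.each do |doc|
    doc.downcase.split.uniq.each { |term| freqs[term] += 1 }
  end
  freqs
end

# Illustrative smoothed IDF: log(1 + N / df).
# Terms absent from the corpus get 0.0 since they cannot match anything.
def idf(term, freqs, total_docs)
  df = freqs[term]
  return 0.0 if df.zero?
  Math.log(1.0 + total_docs.to_f / df)
end

# Weight of each distinct term in the query text against the corpus.
def term_weights(text, docs)
  freqs = document_frequencies(docs)
  text.downcase.split.uniq.map { |term| [term, idf(term, freqs, docs.size)] }.to_h
end

WEIGHTS = term_weights("work with lasers", DOCS)
```

Under this toy formula 'lasers' gets a higher weight than 'work' or 'with' individually, but with only four documents the two common terms added together still slightly outweigh 'lasers' alone. That is at least consistent with the "work with..." documents ranking first on a tiny index.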
Upvotes: 2
Views: 753
Reputation: 143
(answering my own question for other people's benefit)
Turns out the MLT query doesn't work very well unless you have lots of documents. I tried it against an index of about 1 million docs, and the same kind of query as above worked pretty darn well with these parameters:
search query: {
  mlt: {
    like_text: str,
    min_term_freq: 3,
    max_query_terms: 35,
    boost_terms: 2,
    minimum_should_match: '35%'
  }
}
YMMV
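For anyone who wants to tune these values, here is a small sketch that builds the same query hash with per-call overrides. `mlt_query` and `MLT_DEFAULTS` are hypothetical names made up for this example (they are not part of searchkick); the option keys are the ones used in the query above.

```ruby
# Hypothetical defaults, copied from the parameters in this answer.
MLT_DEFAULTS = {
  min_term_freq: 3,
  max_query_terms: 35,
  boost_terms: 2,
  minimum_should_match: "35%"
}.freeze

# Hypothetical helper: returns the MLT query hash for the given text.
# Any key in `overrides` replaces the matching default.
def mlt_query(text, overrides = {})
  { mlt: MLT_DEFAULTS.merge(overrides).merge(like_text: text) }
end

# Intended usage with the model from the question (needs a running cluster):
#   Widget.search query: mlt_query("work with lasers", min_term_freq: 1)
```

Note that min_term_freq counts occurrences within the input text itself, so a value of 3 presumably only makes sense for long input. For a short string like the one in the question, every term occurs once, so you would likely need to override it to 1.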
Upvotes: 6