Reputation: 2329
I have a snippet that retrieves search results from Hibernate Search using Apache Lucene. When I enter a search term such as "college", the results containing words that start with "college" appear far down the result list. Because of this, I decided to sort the result set. My approach is below, but it is not working as expected:
org.apache.lucene.search.Query luceneQuery = qb.keyword().fuzzy().withThreshold(.8f)
.withPrefixLength(1).onFields("fieldName").boostedTo(3)
.matching(searchTerm).createQuery();
// org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, MyEntity.class );
// org.apache.lucene.search.Sort sort = new Sort(
// SortField.FIELD_SCORE,
// new SortField("id", SortField.STRING, true));
// luceneQuery.setSort(sort);
// List results = query.list();
I had to comment out the sorting code in the snippet above because those lines are reported as errors.
Upvotes: 0
Views: 457
Reputation: 9977
Hibernate Search sorts by relevance (score) by default, so you shouldn't need to add a custom sort.
If some results are not high enough in the result list, it means their score is not high enough. To control their score, the easiest solution is probably to add more queries. Generally, the more queries a particular document matches, the higher its score.
In this case, you can try something like this:
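// Fuzzy clause: matches the search term and close variants, with a moderate boost.
org.apache.lucene.search.Query fuzzyQuery = qb.keyword().fuzzy().withThreshold(.8f)
    .withPrefixLength(1).onFields("fieldName").boostedTo(3)
    .matching(searchTerm).createQuery();
// Exact clause: matches the search term itself, with a higher boost.
org.apache.lucene.search.Query exactQuery = qb.keyword().onFields("fieldName").boostedTo(10)
    .matching(searchTerm).createQuery();
// Combine both: a document must match at least one clause, and matching both raises its score.
org.apache.lucene.search.Query luceneQuery = qb.bool()
    .should(fuzzyQuery)
    .should(exactQuery)
    .createQuery();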
Then, documents will match when they contain "college" exactly or approximately, but if they contain "college" exactly, they will match both queries, have a higher score, and appear higher in the result list.
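To make the ranking effect concrete, here is a minimal, self-contained sketch of the idea in plain Java. It is a toy model, not Lucene's actual scoring: it assumes a document's score is simply the sum of the boosts of the clauses it matches, treats "fuzzy" as "some token within edit distance 1", and every name in it (BoolShouldSketch, rank, score, and so on) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy illustration only: real Lucene scoring is more involved than summing boosts.
final class BoolShouldSketch {
    static final double FUZZY_BOOST = 3.0;   // mirrors boostedTo(3) on the fuzzy query
    static final double EXACT_BOOST = 10.0;  // mirrors boostedTo(10) on the exact query

    // Standard Levenshtein edit distance between two strings.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Assumed stand-in for the exact clause: some token equals the term.
    static boolean exactMatch(String text, String term) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.equals(term.toLowerCase())) return true;
        }
        return false;
    }

    // Assumed stand-in for the fuzzy clause: some token is within edit distance 1.
    static boolean fuzzyMatch(String text, String term) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (editDistance(token, term.toLowerCase()) <= 1) return true;
        }
        return false;
    }

    // Sum the boosts of every "should" clause the document matches.
    static double score(String text, String term) {
        double total = 0.0;
        if (fuzzyMatch(text, term)) total += FUZZY_BOOST;
        if (exactMatch(text, term)) total += EXACT_BOOST;
        return total;
    }

    // Keep documents matching at least one clause, highest score first.
    static List<String> rank(List<String> docs, String term) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (score(doc, term) > 0) hits.add(doc);
        }
        hits.sort(Comparator.comparingDouble((String d) -> score(d, term)).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("colleges of art", "high school", "college life");
        System.out.println(rank(docs, "college")); // prints [college life, colleges of art]
    }
}
```

In this sketch "college life" matches both clauses (3 + 10), "colleges of art" matches only the fuzzy one (3), and "high school" matches neither, so it is dropped. The exact match therefore comes first without any explicit sort.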
If your question really was about documents that contain the term "college" first, i.e. giving a higher score to documents that contain the searched term near the start, then you can probably do that too, but it's a more unusual use case. Just add yet another .should() clause with a SpanQuery. You can find more information in this answer.
Upvotes: 1