adrian

Reputation: 11

Lucene Search Luke vs Hibernate Search different result

I am running the following Lucene query in Luke:

+(debtorNumber:10200000 originalDebtorNumber:10200000) +(serviceName:"skype for"^840.0 (serviceName:for* serviceId:for*) (serviceName:skype* serviceId:skype*))

It returns the expected results at the top, for example:

Skype for Business for Managers

Microsoft Skype for Business Conferencing (Plan2)

Telephone dial-in for Skype for Business Conferencing

and so on.

The same query executed through Hibernate Search returns different results :/

For example, I get the following results:

antivirus protection for your PC, notebook or server

central administration for thin clients

The "skype for" matches only show up on the 3rd or 4th page.

The Java code is:

SearchManager searchManager = Search.getSearchManager(cache);
CacheQuery<MyType> query = searchManager.getQuery(booleanQuery, MyType.class);

List<MyType> pagedResult = query
        .maxResults(criteria.getPageSize())
        .firstResult(Math.toIntExact(criteria.getOffset()))
        .list();

This logs the query shown above, which is the one I used in Luke:

log.info("Lucene Search boolean query:" + booleanQuery);

Please advise.

Upvotes: 0

Views: 244

Answers (1)

Sanne

Reputation: 6107

There might be multiple reasons for the difference; let me try to compile a checklist.

Different index

The main difference I can think of is that Luke will always target a single index: the one you opened explicitly.

Hibernate Search will actually run the query on a composite view of all indexes containing MyType and its indexed subclasses (and any shards you might have). Often that's just one index, but do you possibly have multiple indexes open?

That will affect the results, and definitely the scores.
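To illustrate why the set of indexes matters for scoring, here is a minimal sketch in plain Java. It assumes the classic TF-IDF style inverse document frequency, 1 + ln(numDocs / (docFreq + 1)); the exact formula depends on your Lucene version and the configured Similarity, and the class name and all numbers are hypothetical. The point is only that term weights are derived from index-wide statistics, so the same term gets a different weight when more indexes are part of the view.

```java
public class IdfDemo {

    // Classic TF-IDF style inverse document frequency.
    // Assumption: your Lucene version / Similarity may use a different formula.
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical statistics for the term "skype".
        long docFreqSingle = 50;        // documents containing the term in the index opened in Luke
        long numDocsSingle = 10_000;    // total documents in that single index

        // Hypothetical composite view: the same index plus other indexes or shards
        // that add many documents but few extra matches for the term.
        long docFreqComposite = 55;
        long numDocsComposite = 40_000;

        System.out.println("idf, single index:   " + idf(docFreqSingle, numDocsSingle));
        System.out.println("idf, composite view: " + idf(docFreqComposite, numDocsComposite));
    }
}
```

Since every clause of your boolean query is weighted this way, a shift in these statistics can reorder the results even though the query string is identical.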

Different Lucene version

Verify that the Luke version you're using is built on the exact same Lucene version as the one your application uses.

Check the scoring

You can use a Projection query to have Infinispan Query / Hibernate Search explain the scores of all results it produced; this can be very useful to understand what is going on.

See FullTextQuery.EXPLANATION and FullTextQuery.SCORE in section Projections, and Example 105.

IndexReader

You can also use the SearchManager to get the low-level IndexReader(s) and run the query directly, by-passing Infinispan and Hibernate Search code.

SearchIntegrator si = searchManager.unwrap(SearchIntegrator.class);
si.getIndexReaderAccessor(). ...

That might help narrow down which component is affecting your expected scoring.

The IndexReaderAccessor can open an index by type or by name. When opened by name it will open that single index; when opened by type it will apply the rules to satisfy polymorphic queries and might return an aggregate. It might be interesting to experiment with both to verify that they return the same results.

...and check the basics

Make sure you're opening the same physical index :-)

In particular, recent versions of Infinispan might apply sharding transparently to improve data distribution in the cluster. This can be confusing when debugging scoring, especially if you're not aware of it.

Upvotes: 1
