Reputation: 3263
In my application I use Hibernate Search to manage a Lucene index of some of my mapped model classes (10 classes, partly associated with each other, with @IndexedEmbedded used quite often in the index definitions). There are approx. 1,500,000 documents to index.
To rebuild the whole index, I use a mass indexer as proposed in the documentation (http://docs.jboss.org/hibernate/search/3.3/reference/en-US/html/manual-index-changes.html):
fullTextSession
.createIndexer()
.batchSizeToLoadObjects(200)
.cacheMode(CacheMode.IGNORE)
.purgeAllOnStart(true)
.threadsToLoadObjects(10)
.threadsForIndexWriter(10)
.threadsForSubsequentFetching(5)
.startAndWait();
My database connection pool has a size of 50.
I observe that the indexing procedure starts out promisingly fast until it reaches about 25% of all documents. After that, performance declines drastically (the next 5% take twice as long as the first 25%). Why does this happen?
Because I use projections rather than letting Hibernate Search fetch search results from the DB, many of my indexed fields are stored in the index (Store.YES). Does this affect the performance significantly?
-- Edit:
My Hibernate search configuration:
properties.setProperty("hibernate.search.default.directory_provider", "filesystem");
properties.setProperty("hibernate.search.default.indexBase", searchIndexPath);
properties.setProperty("hibernate.search.indexing_strategy", "manual");
properties.setProperty("hibernate.default_batch_fetch_size", "200");
Upvotes: 1
Views: 4365
Reputation: 19129
Have you profiled your application? It is hard to give general recommendations in this case.
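Before reaching for a full profiler, it can help to pin down exactly where the slowdown starts. The following is only a minimal sketch: `ThroughputTracker` is a hypothetical helper (not part of Hibernate Search), and how you obtain the running document count (a progress callback, a log line, a counter of your own) is an assumption that depends on your setup.

```java
// Hypothetical helper: turns periodic "documents done so far" readings
// into a documents-per-second rate for each interval.
class ThroughputTracker {
    private long lastCount = 0;
    private long lastNanos;

    ThroughputTracker(long startNanos) {
        this.lastNanos = startNanos;
    }

    /**
     * Returns the indexing rate (documents per second) since the previous
     * sample, then remembers this sample as the new baseline.
     */
    double sample(long totalDocsDone, long nowNanos) {
        long docs = totalDocsDone - lastCount;
        long elapsedNanos = nowNanos - lastNanos;
        lastCount = totalDocsDone;
        lastNanos = nowNanos;
        // Guard against a zero or negative interval.
        return elapsedNanos <= 0 ? 0.0 : docs * 1_000_000_000.0 / elapsedNanos;
    }
}
```

Calling `sample(count, System.nanoTime())` every few seconds and logging the result shows whether the rate drops suddenly (which would point to something like a resource limit being hit) or decays gradually.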
Also, what configuration settings do you use? There are several properties which can influence the indexing behavior. See http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#search-batchindex-massindexer for more details. What about memory consumption during indexing? Have you monitored this as well?
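If you have not monitored memory yet, the JDK's own management beans are enough for a first look. This is a rough sketch, not a replacement for a profiler: `HeapSampler` is a hypothetical class name, and the idea of running it on a background thread while the mass indexer works is an assumption about how you would wire it in.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports heap usage and cumulative GC time.
class HeapSampler {

    /** Returns one line with used/max heap in MB and total GC time in ms. */
    static String sample() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long mb = 1024L * 1024L;
        long usedMb = heap.getUsed() / mb;
        long maxMb = heap.getMax() < 0 ? -1 : heap.getMax() / mb; // -1 means undefined
        return String.format("heap used=%d MB, max=%d MB, gc time=%d ms",
                usedMb, maxMb, gcMillis);
    }

    /** Starts a daemon thread that prints a sample every intervalMillis. */
    static Thread startDaemon(final long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    System.out.println(sample());
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "heap-sampler");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Start it with `HeapSampler.startDaemon(5000)` just before `startAndWait()`. If used heap climbs toward the maximum and GC time grows quickly around the 25% mark, memory pressure is a likely suspect; if both stay flat, look elsewhere (for example at the database side).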
> Because I make use of projections rather than letting Hibernate Search fetch search results from DB, many of my indexed fields are stored in Index (Store.YES). Does this affect the performance significantly?
I would expect that it mainly influences the index size, not so much the indexing performance.
Upvotes: 2