Reputation: 379
I'm using Lucene.net version 2.9.1 and am facing the following problem when calling Optimize:
I've noticed that some calls to Optimize can take hours, and when a call takes that long, the process that does the indexing and optimizing cannot be killed.
Using the source code, I managed to track down the problem: the call causing this behavior is Optimize(int maxNumSegments, bool doWait). Within this method there are repeated calls to OptimizeMergesPending(), which always returns true, so the loop keeps calling this method until it returns false, which can take ages.
This raises the following questions:
1. What can cause OptimizeMergesPending() to keep returning true?
2. What can prevent the process that indexes and optimizes from being killed?
3. Do you know if newer versions of Lucene.net face the same behavior?
Thanks
Upvotes: 0
Views: 239
Reputation: 19781
The xmldocs for IndexWriter.OptimizeMergesPending state that it will return true "if any merges in pendingMerges or runningMerges are optimization merges". The inline documentation for IndexWriter.DoWait states that it will only wait for one second, to avoid issues where some notifications may not be triggered; it's up to the caller to reevaluate the waiting conditions. I've linked to the 2.9.4g source code, so newer versions also contain this behavior.
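The wait pattern described above can be sketched in isolation. This is only a minimal illustration under stated assumptions, not the Lucene.Net implementation: `FakeWriter`, `AddPendingMerge`, `FinishMerge` and `WaitForOptimize` are hypothetical names, and the "merge" is simulated with a sleeping thread. It shows why the loop keeps spinning for as long as an optimization merge is still pending or running.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the writer; NOT the Lucene.Net implementation.
class FakeWriter
{
    private readonly List<string> pendingOptimizeMerges = new List<string>();

    public void AddPendingMerge(string name)
    {
        lock (this) { pendingOptimizeMerges.Add(name); }
    }

    // Called by the (simulated) merge thread when a merge finishes.
    public void FinishMerge(string name)
    {
        lock (this)
        {
            pendingOptimizeMerges.Remove(name);
            Monitor.PulseAll(this); // notify waiters; they still re-check the condition
        }
    }

    // True while any optimization merge is still registered. Caller holds the lock.
    private bool OptimizeMergesPending()
    {
        return pendingOptimizeMerges.Count > 0;
    }

    // Blocks until no optimization merges remain.
    // Returns how many times the wait condition was evaluated as still pending.
    public int WaitForOptimize()
    {
        int checks = 0;
        lock (this)
        {
            while (OptimizeMergesPending())
            {
                checks++;
                // Wake up after at most one second, even without a notification,
                // then re-evaluate the condition.
                Monitor.Wait(this, 1000);
            }
        }
        return checks;
    }
}

static class Demo
{
    static void Main()
    {
        var writer = new FakeWriter();
        writer.AddPendingMerge("merge-1");

        var mergeThread = new Thread(() =>
        {
            Thread.Sleep(2500); // simulate a slow merge
            writer.FinishMerge("merge-1");
        });
        mergeThread.IsBackground = true;
        mergeThread.Start();

        int checks = writer.WaitForOptimize();
        Console.WriteLine("Wait loop exited after {0} check(s)", checks);
    }
}
```

In this sketch the caller is blocked for as long as the simulated merge runs; with a real index, a merge over many large segments is the kind of work that could keep the condition true for hours.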
An unkillable process is an operating system issue; you should always be able to kill a process as long as it isn't blocked in a kernel/system call. We would need to see process dumps to debug those issues. (Or a better explanation of how you're trying to kill the process...)
Counter-questions:
- Why are you calling IndexWriter.Optimize? Lucene can handle several segments; in fact, it's easier to reopen indexes when only a few segments have changed than to reopen a completely new segment containing the whole index. You could write your own MergePolicy if you have issues with the current handling of segments. Optimize has been deprecated as of 3.5, which Lucene.Net currently lags behind (it's up to 3.0.3 at the moment, and porting of 4.x is in progress).
- Are you locking on your writer? IndexWriter uses lock (this) {...}, which is bad and may cause deadlock issues for you in case you lock on your writer too. This may appear as if your code hangs, and any clean thread termination you may have built will not be triggered (since the thread just blocks).
- If you can avoid calling IndexWriter.Optimize(), do so; it will cause unnecessary CPU and I/O load, both during the actual merge and when reopening your readers.
Upvotes: 4
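To illustrate the lock (this) concern from the second counter-question, here is a small self-contained sketch. `SelfLockingWriter`, `Commit` and `TryCommit` are hypothetical names, not Lucene.Net API; the class merely synchronizes on itself the way the writer is described as doing. The worker uses a timeout so the demo can report the block instead of hanging.

```csharp
using System;
using System.Threading;

// Hypothetical class that synchronizes on itself; NOT the Lucene.Net IndexWriter.
class SelfLockingWriter
{
    public void Commit()
    {
        lock (this) { /* internal work guarded by the instance monitor */ }
    }

    // Same monitor as Commit, but gives up after a timeout so the demo can observe the block.
    public bool TryCommit(int timeoutMs)
    {
        if (!Monitor.TryEnter(this, timeoutMs)) return false;
        try { return true; }
        finally { Monitor.Exit(this); }
    }
}

static class LockDemo
{
    // Returns true if the worker could not enter the writer's monitor
    // while application code was holding a lock on the same instance.
    public static bool WorkerBlockedWhileCallerHoldsLock()
    {
        var writer = new SelfLockingWriter();
        bool acquired = true;

        lock (writer) // application code locking on the writer instance
        {
            var worker = new Thread(() => acquired = writer.TryCommit(200));
            worker.Start();
            // If the worker called Commit() (a plain lock (this)) instead,
            // this Join would never return: a deadlock.
            worker.Join();
        }
        return !acquired;
    }

    static void Main()
    {
        Console.WriteLine("Worker blocked: {0}", WorkerBlockedWhileCallerHoldsLock());
    }
}
```

Locking on a private object that only your own code can see, rather than on the writer instance, avoids sharing a monitor with the library's internals.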