Desidero

Reputation: 321

Lucene 4.x performance issues

Over the last few weeks I've been working on upgrading an application from Lucene 3.x to Lucene 4.x in hopes of improving performance. Unfortunately, after going through the full migration process and trying all sorts of tweaks I found online and in the documentation, Lucene 4 is running significantly slower than Lucene 3 (roughly 50% slower). I'm pretty much out of ideas at this point, and was wondering if anyone else had suggestions on how to bring it up to speed. I'm not even looking for a big improvement over 3.x anymore; I'd be happy to just match it and stay on a current release of Lucene.

<Edit>

In order to confirm that none of the standard migration changes had a negative effect on performance, I ported my Lucene 4.x version back to Lucene 3.6.2 and kept the newer API rather than using the custom ParallelMultiSearcher and other deprecated methods/classes.

Performance in 3.6.2 is even faster than before:

Since the optimizations and use of the newer Lucene API actually improved performance on 3.6.2, it doesn't make sense for this to be a problem with anything but Lucene. I just don't know what else I can change in my program to fix it.

</Edit>

Application Information

General Processing

1) Request received by socket listener

2) Up to 4 Query objects are generated and populated with normalized user input (all of the required input for a query must be present or it won't be executed)

3) Queries are executed in parallel using the Fork/Join framework

4) Aggregation and other simple post-processing
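Steps 2 to 4 above can be sketched roughly as follows. This is only an illustration of the fan-out/aggregate shape using the JDK's Fork/Join pool, not the application's actual code: `ParallelQuerySketch`, `fakeQuery` and `runAndAggregate` are hypothetical names, and each "query" is a stand-in that returns a fixed hit count instead of calling Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

class ParallelQuerySketch {

    // Hypothetical stand-in for executing one Lucene Query; returns a hit count.
    static Callable<Integer> fakeQuery(final int hits) {
        return () -> hits;
    }

    // Step 3: submit every query to the pool. Step 4: wait and sum the results.
    static int runAndAggregate(ForkJoinPool pool, List<Callable<Integer>> queries)
            throws Exception {
        List<ForkJoinTask<Integer>> pending = new ArrayList<>();
        for (Callable<Integer> query : queries) {
            pending.add(pool.submit(query));
        }
        int total = 0;
        for (ForkJoinTask<Integer> task : pending) {
            total += task.get(); // blocks until that query has finished
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Step 2: up to 4 queries built from the request (values are made up).
        List<Callable<Integer>> queries = new ArrayList<>();
        queries.add(fakeQuery(1));
        queries.add(fakeQuery(2));
        queries.add(fakeQuery(3));
        queries.add(fakeQuery(4));

        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            System.out.println("total hits: " + runAndAggregate(pool, queries));
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the shape is that the request thread only blocks in the aggregation loop, which is why waiting calls on the request side show up with near-zero CPU time in a profile.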

Other Relevant Info

4.x Hot Spots

Method | Self Time (%) | Self Time (ms) | Self Time (CPU in ms)

java.util.concurrent.CountDownLatch.await() | 11.29% | 140887.219 | 0.0 <- this is just from tcp threads waiting for the real work to finish - you can ignore it
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.<init>() | 9.74% | 121594.03 | 121594
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.<init>() | 9.59% | 119680.956 | 119680
org.apache.lucene.codecs.lucene41.ForUtil.readBlock() | 6.91% | 86208.621 | 86208
org.apache.lucene.search.DisjunctionScorer.heapAdjust() | 6.68% | 83332.525 | 83332
java.util.concurrent.ExecutorCompletionService.take() | 5.29% | 66081.499 | 6153
org.apache.lucene.search.DisjunctionSumScorer.afterNext() | 4.93% | 61560.872 | 61560
org.apache.lucene.search.TermScorer.advance() | 4.53% | 56530.752 | 56530
java.nio.DirectByteBuffer.get() | 3.96% | 49470.349 | 49470
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.<init>() | 2.97% | 37051.644 | 37051
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.getFrame() | 2.77% | 34576.54 | 34576
org.apache.lucene.codecs.MultiLevelSkipListReader.skipTo() | 2.47% | 30767.711 | 30767
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.newTermState() | 2.23% | 27782.522 | 27782
java.net.ServerSocket.accept() | 2.19% | 27380.696 | 0.0
org.apache.lucene.search.DisjunctionSumScorer.advance() | 1.82% | 22775.325 | 22775
org.apache.lucene.search.HitQueue.getSentinelObject() | 1.59% | 19869.871 | 19869
org.apache.lucene.store.ByteBufferIndexInput.buildSlice() | 1.43% | 17861.148 | 17861
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.getArc() | 1.35% | 16813.927 | 16813
org.apache.lucene.search.DisjunctionSumScorer.countMatches() | 1.25% | 15603.283 | 15603
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs() | 1.12% | 13929.646 | 13929
java.util.concurrent.locks.ReentrantLock.lock() | 1.05% | 13145.631 | 8618
org.apache.lucene.util.PriorityQueue.downHeap() | 1.00% | 12513.406 | 12513
java.util.TreeMap.get() | 0.89% | 11070.192 | 11070
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs() | 0.80% | 10026.117 | 10026
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData() | 0.62% | 7746.05 | 7746
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.iterator() | 0.60% | 7482.395 | 7482
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekExact() | 0.55% | 6863.069 | 6863
org.apache.lucene.store.DataInput.clone() | 0.54% | 6721.357 | 6721
java.nio.DirectByteBufferR.duplicate() | 0.48% | 5930.226 | 5930
org.apache.lucene.util.fst.ByteSequenceOutputs.read() | 0.46% | 5708.354 | 5708
org.apache.lucene.util.fst.FST.findTargetArc() | 0.45% | 5601.63 | 5601
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock() | 0.45% | 5567.914 | 5567
org.apache.lucene.store.ByteBufferIndexInput.toString() | 0.39% | 4889.302 | 4889
org.apache.lucene.codecs.lucene41.Lucene41SkipReader.<init>() | 0.33% | 4147.285 | 4147
org.apache.lucene.search.TermQuery$TermWeight.scorer() | 0.32% | 4045.912 | 4045
org.apache.lucene.codecs.MultiLevelSkipListReader.<init>() | 0.31% | 3890.399 | 3890
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock() | 0.31% | 3886.194 | 3886


If there's any other information you could use that might help, please let me know.

Upvotes: 4

Views: 1867

Answers (1)

Desidero

Reputation: 321

For anyone who cares or is trying to do something similar (controlled parallelism within a query), the problem I had was that the IndexSearcher was creating a task per segment per shard rather than a task per shard - I misread the javadoc.

I resolved the problem by using forceMerge(1) on my shards to limit the number of extra threads. In my use case this isn't a big deal since I don't currently use NRT search, but it still adds unnecessary complexity to the update + slave synchronization process, so I'm looking into ways to avoid the forceMerge.

As a quick fix, I'll probably just extend the IndexSearcher and have it spawn a thread per reader instead of a thread per segment, but the idea of a "virtual segment" was brought up in the Lucene mailing list. That would be a much better long-term fix.
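To make the difference in task counts concrete, here is a small JDK-only model of the two scheduling strategies. It does not use Lucene's API: `ShardTaskSketch` and its methods are hypothetical, each shard is just a list of per-segment hit counts, and the numbers are made up. It only shows that one-task-per-segment submits as many tasks as there are segments, while one-task-per-shard walks a shard's segments sequentially inside a single task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardTaskSketch {

    // One task per segment: a shard with N segments costs N submissions.
    static List<Callable<Integer>> tasksPerSegment(List<List<Integer>> shards) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (List<Integer> shard : shards) {
            for (final Integer segmentHits : shard) {
                tasks.add(() -> segmentHits);
            }
        }
        return tasks;
    }

    // One task per shard: segments are searched sequentially inside the task.
    static List<Callable<Integer>> tasksPerShard(List<List<Integer>> shards) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (final List<Integer> shard : shards) {
            tasks.add(() -> {
                int hits = 0;
                for (int segmentHits : shard) {
                    hits += segmentHits;
                }
                return hits;
            });
        }
        return tasks;
    }

    // Runs all tasks on the pool and sums their results.
    static int run(ExecutorService pool, List<Callable<Integer>> tasks) throws Exception {
        int total = 0;
        for (Future<Integer> future : pool.invokeAll(tasks)) {
            total += future.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Three shards with 3, 2 and 1 segments (made-up hit counts).
        List<List<Integer>> shards = Arrays.asList(
                Arrays.asList(3, 1, 2), Arrays.asList(4, 4), Arrays.asList(5));
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<Integer>> bySegment = tasksPerSegment(shards);
            List<Callable<Integer>> byShard = tasksPerShard(shards);
            System.out.println(bySegment.size() + " tasks, hits=" + run(pool, bySegment));
            System.out.println(byShard.size() + " tasks, hits=" + run(pool, byShard));
        } finally {
            pool.shutdown();
        }
    }
}
```

Both strategies return the same total; only the number of submitted tasks changes. Merging each shard down to a single segment collapses the first case into the second, which is why forceMerge(1) worked as a fix.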

If you want more detail, you can follow the Lucene mailing list thread here: http://www.mail-archive.com/[email protected]/msg42961.html

Upvotes: 2
