Marcin Sanecki

Reputation: 1334

Lucene migration from 3.x to 4.1.0 and index optimisation

I have migrated from Lucene 3.x to 4.1.0. After creating a new index, I noticed there are many more files in the index directory. Lucene 3 used IndexWriter.optimize() to collapse files. Its successor in v4 is IndexWriter.forceMerge(int maxNumSegments). I have tried forceMerge with different values for maxNumSegments, and I always get the same index files. I expected the files to be merged into one, or at least fewer, index files. Am I wrong? Do you know how to do it?

Upvotes: 0

Views: 1648

Answers (2)

jpountz

Reputation: 9964

Maybe you are looking for Lucene's compound file format, which stores all of a segment's logical index files in a single physical file. See MergePolicy.setUseCompoundFile(true).
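To check whether forceMerge actually changed anything, it can help to count files per segment rather than files overall, since one segment may be spread over many files when the compound format is off. Below is a rough sketch using only the JDK (no Lucene dependency). It assumes the usual naming convention where per-segment files start with an underscore followed by the segment name (for example `_0.fdt` or `_0_Lucene41_0.doc`), and it skips bookkeeping files such as `segments_N` and `write.lock`. The class name `SegmentFileCount` and the sample listing are made up for illustration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SegmentFileCount {

    // Returns the segment name for a per-segment file such as "_0.fdt" or
    // "_0_Lucene41_0.doc". Returns null for files that are not tied to a
    // single segment (segments_N, segments.gen, write.lock).
    static String segmentName(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = fileName.length();
        int underscore = fileName.indexOf('_', 1); // start of an optional suffix
        int dot = fileName.indexOf('.');           // start of the extension
        if (underscore >= 0) {
            end = Math.min(end, underscore);
        }
        if (dot >= 0) {
            end = Math.min(end, dot);
        }
        return fileName.substring(0, end);
    }

    // Counts how many files belong to each segment, sorted by segment name.
    static Map<String, Integer> countBySegment(List<String> fileNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : fileNames) {
            String segment = segmentName(name);
            if (segment != null) {
                counts.merge(segment, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> names;
        if (args.length > 0 && new File(args[0]).list() != null) {
            // Real run: pass the index directory as the first argument.
            names = Arrays.asList(new File(args[0]).list());
        } else {
            // Hypothetical listing, used only when no directory is given.
            names = Arrays.asList("_0.fdt", "_0.fdx", "_0_Lucene41_0.doc",
                                  "_1.cfs", "_1.cfe", "_1.si",
                                  "segments_2", "segments.gen", "write.lock");
        }
        Map<String, Integer> counts = countBySegment(names);
        System.out.println(counts.size() + " segment(s): " + counts);
    }
}
```

If the number of segments drops after forceMerge but the number of files per segment stays high, the merge worked and the remaining file count comes from the non-compound layout, which is what the compound file setting addresses.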

Upvotes: 3

mindas

Reputation: 26713

Apart from ideological reasons (fewer files are better than more), are there any practical reasons why you need fewer files? Provided the overall number of bytes for a given index is roughly the same, what's the difference?

The reason optimization was removed is that it was inefficient: it would kill search performance, cause load spikes, etc. The performance of searching over multiple segments has improved, and the need for .optimize() is no longer justifiable. Lucene now uses TieredMergePolicy instead, which nicely balances the load and solves this problem from a different angle.

Upvotes: 6
