fuujinnn

Reputation: 21

Lucene.NET indexing performance issue after upgrading from 3.0.3 to 4.8

After migrating from Lucene.NET 3.0.3 to 4.8, indexing new documents is slower than it was in 3.0.3, although the index files are much smaller.

Here is my code:

    private IndexReader reader;
    private IndexSearcher searcher;

    var writeconfig = new IndexWriterConfig(Lucene.Net.Util.LuceneVersion.LUCENE_48, analyzer);
    writer = new IndexWriter(_directory, writeconfig);

    foreach (var member in list_of_members)
    {
        new_(writer, member.name, member.surname, member.location);
    }
    writer.Dispose();

    reader = DirectoryReader.Open(index_location);
    searcher = new IndexSearcher(reader);

    public void new_(Lucene.Net.Index.IndexWriter writer, string name, string surname, string location)
    {
        Document doc = new Document();

        doc.Add(new StringField("name", name, Field.Store.YES));
        doc.Add(new TextField("surname", surname, Field.Store.YES));
        doc.Add(new StringField("location", location, Field.Store.YES));

        writer.AddDocument(doc);
    }

Compared with 3.0.3, indexing new documents in 4.8 is almost 2x slower.

Edit 1: I found out that the performance problem is caused by stored field compression.

I found this website about the performance of stored field compression.

The site explains how to disable compression in Java, but I couldn't convert the code to C#.

So my question is: how can I disable stored field compression in Lucene.NET 4.8?

Upvotes: 3

Views: 1037

Answers (1)

eladm

Reputation: 196

This looks like a compression issue. Since the Lucene 4.1 codec, stored fields are compressed by default, and in this case the compression penalty is too high.

Add a codec that stores fields without compression:

public class NoCompressionCodec : FilterCodec
{
    internal NoCompressionCodec(Codec @delegate) : base(@delegate)
    {
    }

    public override StoredFieldsFormat StoredFieldsFormat => new Lucene40StoredFieldsFormat();
}

Override the default codec factory so the new codec can be resolved by name:

public class CustomCodecFactory : DefaultCodecFactory
{
    private readonly NoCompressionCodec _noCompressionCodec;
    public CustomCodecFactory()
    {
        _noCompressionCodec = new NoCompressionCodec(Codec.Default);
    }

    protected override void Initialize()
    {
        PutCodecType(typeof(NoCompressionCodec));
        base.Initialize();
    }

    protected override Codec GetCodec(Type type)
    {
        if (type == typeof(NoCompressionCodec))
            return _noCompressionCodec;

        return base.GetCodec(type);
    }
}

Then run this once at application startup:

Codec.SetCodecFactory(new CustomCodecFactory());

Finally, set the codec on your index writer config:

indexWriterConfig.Codec = new NoCompressionCodec(Codec.Default);
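
To check whether the codec actually helps on your data, a rough timing helper along these lines may be useful. This is only a sketch: it assumes the Lucene.NET 4.8 packages (Lucene.Net and Lucene.Net.Analysis.Common), it repeats the two classes from above so that it compiles on its own, and `CodecDemo`, `IndexDocs`, the `RAMDirectory` and the generated field values are hypothetical stand-ins for your own directory and data.

```csharp
using System;
using System.Diagnostics;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Codecs;
using Lucene.Net.Codecs.Lucene40;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

// Same codec as above, repeated so this sketch is self-contained.
public class NoCompressionCodec : FilterCodec
{
    internal NoCompressionCodec(Codec @delegate) : base(@delegate) { }

    // Uncompressed stored fields format from the Lucene 4.0 codec.
    public override StoredFieldsFormat StoredFieldsFormat => new Lucene40StoredFieldsFormat();
}

// Same factory as above: registers the codec so it can be found by name.
public class CustomCodecFactory : DefaultCodecFactory
{
    private readonly NoCompressionCodec _noCompressionCodec;

    public CustomCodecFactory()
    {
        _noCompressionCodec = new NoCompressionCodec(Codec.Default);
    }

    protected override void Initialize()
    {
        PutCodecType(typeof(NoCompressionCodec));
        base.Initialize();
    }

    protected override Codec GetCodec(Type type)
    {
        if (type == typeof(NoCompressionCodec))
            return _noCompressionCodec;

        return base.GetCodec(type);
    }
}

public static class CodecDemo
{
    // Hypothetical helper: indexes `count` small documents into an in-memory
    // directory and returns the elapsed time in milliseconds.
    // Pass null to keep the default (compressing) codec.
    public static long IndexDocs(Codec codec, int count)
    {
        using (var dir = new RAMDirectory())
        using (var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48))
        {
            var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
            if (codec != null)
                config.Codec = codec;

            var stopwatch = Stopwatch.StartNew();
            using (var writer = new IndexWriter(dir, config))
            {
                for (int i = 0; i < count; i++)
                {
                    // Same field layout as in the question, with made-up values.
                    var doc = new Document();
                    doc.Add(new StringField("name", "name" + i, Field.Store.YES));
                    doc.Add(new TextField("surname", "surname " + i, Field.Store.YES));
                    doc.Add(new StringField("location", "location" + i, Field.Store.YES));
                    writer.AddDocument(doc);
                }
                writer.Commit();
            }
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }
    }
}
```

Call `Codec.SetCodecFactory(new CustomCodecFactory())` first, then compare `CodecDemo.IndexDocs(null, n)` with `CodecDemo.IndexDocs(new NoCompressionCodec(Codec.Default), n)` for a realistic `n`. The codec name is written into the index, so the factory should also be registered before opening that index for reading. Expect the index to grow again once compression is off.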

Upvotes: 3
