Reputation: 21
After migrating from Lucene.Net 3.0.3 to 4.8, indexing new documents is slower than it was in 3.0.3,
but the index files are much smaller.
Here is my code:
private IndexReader reader;
private IndexSearcher searcher;
var writeconfig = new IndexWriterConfig(Lucene.Net.Util.LuceneVersion.LUCENE_48, analyzer);
writer = new IndexWriter(_directory, writeconfig);
foreach (var member in list_of_members)
{
    new_(writer, member.name, member.surname, member.location);
}
writer.Dispose();
reader = DirectoryReader.Open(index_location);
searcher = new IndexSearcher(reader);
public void new_(Lucene.Net.Index.IndexWriter writer, string name, string surname, string location)
{
    Document doc = new Document();
    doc.Add(new StringField("name", name, Field.Store.YES));
    doc.Add(new TextField("surname", surname, Field.Store.YES));
    doc.Add(new StringField("location", location, Field.Store.YES));
    writer.AddDocument(doc);
}
Compared with 3.0.3, indexing new documents in 4.8 is almost 2x slower.
Edit 1: I found out that the performance problem is caused by stored field compression.
I found this website about the performance of stored field compression.
The site explains how to disable compression in Java, but I couldn't convert the code to C#.
So my question is: how can I disable stored field compression in Lucene.Net 4.8?
Upvotes: 3
Views: 1037
Reputation: 196
This looks like a compression issue. Since the Lucene 4.1 index format, stored fields are compressed by default. In this case, the compression penalty is too high.
Add a no-compression codec:
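using Lucene.Net.Codecs;
using Lucene.Net.Codecs.Lucene40;

public class NoCompressionCodec : FilterCodec
{
    internal NoCompressionCodec(Codec @delegate) : base(@delegate)
    {
    }

    // Use the uncompressed Lucene 4.0 stored fields format instead of the compressing default.
    public override StoredFieldsFormat StoredFieldsFormat => new Lucene40StoredFieldsFormat();
}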
Override the default codec factory:
public class CustomCodecFactory : DefaultCodecFactory
{
    private readonly NoCompressionCodec _noCompressionCodec;

    public CustomCodecFactory()
    {
        _noCompressionCodec = new NoCompressionCodec(Codec.Default);
    }

    protected override void Initialize()
    {
        PutCodecType(typeof(NoCompressionCodec));
        base.Initialize();
    }

    protected override Codec GetCodec(Type type)
    {
        if (type == typeof(NoCompressionCodec))
            return _noCompressionCodec;
        return base.GetCodec(type);
    }
}
Then run this once at startup:
Codec.SetCodecFactory(new CustomCodecFactory());
On your index writer config, set the codec:
indexWriterConfig.Codec = new NoCompressionCodec(Codec.Default);
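
To check whether disabling compression actually recovers the indexing speed, it may help to time the same indexing run with and without the codec set. This is a minimal sketch that uses only System.Diagnostics.Stopwatch; `IndexTiming` and `Measure` are hypothetical names introduced for illustration and are not part of Lucene.Net.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of Lucene.Net) for timing an indexing run.
public static class IndexTiming
{
    // Runs the given action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action indexingRun)
    {
        if (indexingRun == null) throw new ArgumentNullException(nameof(indexingRun));

        var stopwatch = Stopwatch.StartNew();
        indexingRun();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

You would wrap the whole loop, from creating the IndexWriter to calling Dispose, in the action passed to `IndexTiming.Measure`, and compare the result for a config with and without the codec. A single run is noisy, so repeating the measurement a few times on a fresh directory gives a fairer comparison.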
Upvotes: 3