Reputation: 1653
There are many ways to improve Lucene indexing performance. I have followed several tips from the ImproveIndexingSpeed page, including:
Using addDocument instead of updateDocument.
Re-using Document and Field instances.
The first tip gave a large improvement (about 7 to 8 times faster), but the second one did not.
I made the Document and Field instances static so that they are not instantiated for every record, which saves the object-creation overhead:
private static Document doc = new Document();
private static Field uinField = new StringField("uin", "", Store.YES);
private static Field nameField = new StringField("name", "", Store.YES);
private static Field urlField = new StringField("url", "", Store.YES);
private static Field servField = new TextField("services", "", Store.YES);
I use each Field's setStringValue method to change its value, then add the fields to the doc instance:
uinField.setStringValue(String.valueOf(p.getUin()));
nameField.setStringValue(p.getName());
urlField.setStringValue(p.getUrl());
servField.setStringValue(p.getService());
doc.add(uinField);
doc.add(nameField);
doc.add(urlField);
doc.add(servField);
When I ran the indexing, the process got stuck in what looks like an endless loop. My guess is that this is a side effect of multithreading: one thread locks the Document and Field instances and prevents the other threads from calling addDocument.
My questions are:
What is wrong with the "reuse" part? I think there must be something wrong with my implementation, because the docs do not mention that re-using Document and Field instances is incompatible with a multithreaded design.
Any suggestions on how to implement re-use of Document and Field instances will be appreciated.
Upvotes: 0
Views: 884
Reputation: 101
You don't need to add the fields to the doc on every iteration. Add them once, outside your loop, and inside the loop call only field.setStringValue and writer.addDocument, like this:
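Document doc = new Document();
// Create each field once and add it to the document once, outside the loop.
Field field1 = new TextField("field1", "", Field.Store.YES);
doc.add(field1);
Field field2 = new StringField("field2", "", Field.Store.YES);
doc.add(field2);
while ((line = br.readLine()) != null) {
    // Only the values change per document; field1Value and field2Value
    // stand for whatever you extract from the current line.
    field1.setStringValue(field1Value);
    field2.setStringValue(field2Value);
    writer.addDocument(doc);
}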
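If several threads index at the same time, they must not share one static Document and its Field objects, because each thread would overwrite the values another thread has just set. One possible approach is to give every thread its own reusable instance through a ThreadLocal. The sketch below shows only that pattern with plain Java stand-ins so it runs without Lucene on the classpath: ReusableField, ReusableDoc, index and run are hypothetical names, not Lucene classes or methods. In real code the holder would contain a Document with its fields added once, and the sink.add call would be writer.addDocument(doc). The sketch also assumes, as the stand-in does, that the document is fully read before the add call returns.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PerThreadReuse {
    // Hypothetical stand-in for a mutable, reusable field (not a Lucene class).
    static final class ReusableField {
        String value = "";
        void setStringValue(String v) { value = v; }
    }

    // Hypothetical stand-in for a document whose fields are created exactly once.
    static final class ReusableDoc {
        final ReusableField uin = new ReusableField();
        final ReusableField name = new ReusableField();
    }

    // One reusable document per thread, so no two threads mutate the same fields.
    private static final ThreadLocal<ReusableDoc> DOC =
            ThreadLocal.withInitial(ReusableDoc::new);

    // Sets the values on this thread's document, then hands it to the sink.
    // The sink plays the role of the index writer and reads the values immediately.
    static void index(Queue<String> sink, int uin, String name) {
        ReusableDoc doc = DOC.get();
        doc.uin.setStringValue(String.valueOf(uin));
        doc.name.setStringValue(name);
        sink.add(doc.uin.value + ":" + doc.name.value);
    }

    // Indexes threads * docsPerThread records concurrently and returns how many
    // stored entries have a uin and a name that belong to the same record.
    static int run(int threads, int docsPerThread) {
        Queue<String> sink = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * docsPerThread;
            pool.execute(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    index(sink, base + i, "name" + (base + i));
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int consistent = 0;
        for (String entry : sink) {
            String[] parts = entry.split(":");
            if (parts[1].equals("name" + parts[0])) {
                consistent++;
            }
        }
        return consistent;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000) + " consistent documents");
    }
}
```

Because each thread only ever touches its own holder, no locking is needed around the setStringValue calls. The trade-off is one Document and one set of Field objects per indexing thread instead of one overall, which is still far fewer allocations than creating them per record.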
Upvotes: 2
Reputation: 241
Well, I have read the ImproveIndexingSpeed tips. The tip "Re-use Document and Field instances" has a note:
"Note that you cannot re-use a single Field instance within a Document, and, you should not change a Field's value until the Document containing that Field has been added to the index. See Field for details. "
So I think you should make sure the fields have been written to the index; after that, the Field instances can be re-used. However, I don't know of a way to tell when a field has been written to the index. If you know of one, please tell me. Thank you.
Apologies for my poor English.
Upvotes: 1