Reputation: 3240
I've probably gone through numerous S.O. posts on this issue, but I'm at a loss and can't figure out what the problem is.
I can add and update docs to the index, but I cannot seem to successfully delete them.
I'm using Lucene.NET v3.0.3.
One suggestion I read was to run a query using the same conditions and confirm that it returns a result. I did so.
First, I have a method that returns the items in my database that have been marked as deleted:
var deletedItems = VehicleController.GetDeleted(lastcheck);
Right now during testing, this includes a single item. I then iterate:
// This method returns my writer
var indexWriter = LuceneController.GetWriter();
// And my searcher
var searcher = new IndexSearcher(indexWriter.GetReader());
// And iterate over my items (just one for testing)
foreach(var c in deletedItems) {
// Here I'm testing by doing a query
var query = new BooleanQuery();
query.Add(new TermQuery(new Term("key", c.Guid.ToString())), Occur.MUST);
// Let's see if it can find the record based on this
var docs = searcher.Search(query, 1);
var foundDoc = docs.ScoreDocs.FirstOrDefault();
// Yep, we have one... let's get the full doc to be sure
var actualDoc = searcher.Doc(foundDoc.Doc);
// If I inspect actualDoc, it's the right one... I want to delete it.
indexWriter.DeleteDocuments(query);
indexWriter.Commit();
}
I've condensed the logic above so it's easier to read, but I've also tried other calls after the delete, such as:
indexWriter.Optimize();
indexWriter.Flush(true, true, true);
If I watch the folder where the index is stored, I can see files such as 0_1.del appear, which seems promising.
I then read somewhere about a merge policy, but isn't that what Flush is supposed to do?
I also read a suggestion to limit the optimize call to a maximum of one segment (i.e. indexWriter.Optimize(1)), and that still didn't work.
So using the same query to fetch works, but deleting does not. Why? What else can I check? Does a delete actually remove the item permanently, or does it live on in some other form until I completely delete the directory that's being used? I don't understand.
Upvotes: 3
Views: 757
Reputation: 33791
Index segment files in Lucene are immutable: they never change once written. So when a deletion is recorded, the deleted record is not removed from the index files immediately; it is simply marked as deleted. The record is physically removed only once that index segment is merged to produce a new segment, i.e. the deleted record won't be in the new segment that results from the merge.
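To make the "marked, then merged away" behaviour concrete, here is a minimal sketch against Lucene.NET 3.0.3. It assumes an in-memory RAMDirectory and two throwaway documents with hypothetical keys "a" and "b" (none of this comes from the question's code). It compares NumDocs() (live documents) with MaxDoc (document slots, including deleted ones) right after a delete, and again after asking the writer to merge deletions away with ExpungeDeletes().

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class DeleteMarkingSketch
{
    // Returns { liveAfterDelete, slotsAfterDelete, liveAfterExpunge, slotsAfterExpunge }.
    public static int[] Run()
    {
        var dir = new RAMDirectory(); // assumption: in-memory index for the demo
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
        var result = new int[4];

        using (var writer = new IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            // Index two documents with hypothetical keys and commit them as a segment.
            foreach (var key in new[] { "a", "b" })
            {
                var doc = new Document();
                doc.Add(new Field("key", key, Field.Store.YES, Field.Index.NOT_ANALYZED));
                writer.AddDocument(doc);
            }
            writer.Commit();

            // Delete one document: the segment itself is untouched, the doc is only marked.
            writer.DeleteDocuments(new Term("key", "a"));
            writer.Commit();
            using (var reader = IndexReader.Open(dir, true))
            {
                result[0] = reader.NumDocs(); // live documents
                result[1] = reader.MaxDoc;    // slots, still counting the deleted doc
            }

            // Ask the writer to merge segments that carry deletions, then commit.
            writer.ExpungeDeletes();
            writer.Commit();
            using (var reader = IndexReader.Open(dir, true))
            {
                result[2] = reader.NumDocs();
                result[3] = reader.MaxDoc;    // the deleted slot should now be gone
            }
        }
        return result;
    }
}
```

If the sketch behaves as intended, the slot count stays above the live count until the merge, which is also why .del files show up on disk next to the unchanged segment files.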
Theoretically, once commit is called, the deletion should be removed from the reader's view, since you are getting the reader from the writer (i.e. it's a real-time reader). This is documented here:
Note that flushing just moves the internal buffered state in IndexWriter into the index, but these changes are not visible to IndexReader until either commit() or close() is called.
source: https://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/index/IndexWriter.html
But you might want to try closing the reader after the deletion takes place and then getting a new reader from the writer to see if that new reader now has the record removed from visibility.
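The suggestion above can be sketched as follows. This is a minimal, self-contained illustration, not the asker's actual code: the RAMDirectory, the class name, and the literal key value are assumptions made for the demo. The point it tries to show is that a reader obtained from the writer is a point-in-time snapshot, so a searcher built before the delete keeps reporting the document, while a reader reopened after Commit() should not.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class ReopenAfterDeleteSketch
{
    // Returns { hitsFromOldSearcher, hitsFromReopenedSearcher } after the delete.
    public static int[] Run()
    {
        var dir = new RAMDirectory(); // assumption: in-memory index for the demo
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
        const string key = "hypothetical-guid"; // stands in for c.Guid.ToString()

        using (var writer = new IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("key", key, Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.AddDocument(doc);
            writer.Commit();

            var query = new TermQuery(new Term("key", key));

            // Reader and searcher created BEFORE the delete, as in the question.
            IndexReader oldReader = writer.GetReader();
            var oldSearcher = new IndexSearcher(oldReader);

            writer.DeleteDocuments(query);
            writer.Commit();

            // The old searcher still works on its original snapshot.
            int staleHits = oldSearcher.Search(query, 1).TotalHits;

            // Reopen to pick up the deletion; Reopen() returns a new reader if anything changed.
            IndexReader newReader = oldReader.Reopen();
            var newSearcher = new IndexSearcher(newReader);
            int freshHits = newSearcher.Search(query, 1).TotalHits;

            // Clean up; only dispose the old reader separately if Reopen() returned a new instance.
            newSearcher.Dispose();
            oldSearcher.Dispose();
            if (!ReferenceEquals(newReader, oldReader)) newReader.Dispose();
            oldReader.Dispose();

            return new[] { staleHits, freshHits };
        }
    }
}
```

In the question's loop, this would mean building the IndexSearcher (or reopening its reader) after the DeleteDocuments and Commit calls before checking whether the document is still found.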
Upvotes: 1