Vivek Goel

Reputation: 24140

Why am I seeing deleted documents in Elasticsearch?

I have an Elasticsearch cluster where I always use the Update API with doc_as_upsert. I never call the Delete API, yet I am seeing a large DeletedDocuments metric in Elasticsearch. Does an upsert indirectly perform a delete followed by an insert?

Upvotes: 3

Views: 1117

Answers (1)

Ivan Mamontov

Reputation: 2924

Elasticsearch (ES) does not update documents in place: documents are immutable and cannot be changed. The Update API appears to change a document in place, but Elasticsearch actually does the following:

  1. Retrieves the JSON from the old document
  2. Changes it
  3. Deletes the old document
  4. Indexes a new document
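
The four steps above can be sketched with a toy in-memory model. This is only an illustration, not Elasticsearch's actual code: `ToyIndex`, `upsert`, and `deleted_count` are hypothetical names, and a plain dict merge stands in for the real partial-document merge.

```python
class ToyIndex:
    """Toy append-only index (hypothetical, not Elasticsearch's implementation)."""

    def __init__(self):
        self._docs = []        # append-only storage; entries are never rewritten
        self._latest = {}      # doc id -> position of the current live copy
        self.deleted_count = 0

    def get(self, doc_id):
        pos = self._latest.get(doc_id)
        return None if pos is None else dict(self._docs[pos]["source"])

    def upsert(self, doc_id, partial):
        old = self.get(doc_id)                    # 1. retrieve the old JSON
        new = {**(old or {}), **partial}          # 2. change it
        if old is not None:
            # 3. "delete" the old copy: it is only flagged, not removed
            self._docs[self._latest[doc_id]]["live"] = False
            self.deleted_count += 1
        # 4. index the new copy as a brand-new entry
        self._docs.append({"id": doc_id, "source": new, "live": True})
        self._latest[doc_id] = len(self._docs) - 1


index = ToyIndex()
index.upsert("1", {"name": "a"})   # first call inserts, nothing to delete
index.upsert("1", {"views": 1})    # second call replaces the first copy
index.upsert("1", {"views": 2})    # third call replaces the second copy
print(index.get("1"), index.deleted_count)
```

In this model every upsert that hits an existing id leaves one flagged copy behind, which is why a deleted-documents counter can grow even though no delete is ever requested explicitly.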

Internally, Lucene (the library that ES is built on) simply marks a bit in a per-segment bitset to record that the document is deleted. All subsequent searches skip any deleted documents. This approach is necessary because it would otherwise be far too costly to update Lucene's write-once index data structures, such as posting lists. You can read more about deletions in this blog post.
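
As a rough sketch of that idea, the snippet below keeps a write-once tuple of documents plus an integer used as a bitset of deleted positions. `ToySegment` and its methods are hypothetical names chosen for illustration; Lucene's real data structures are considerably more involved.

```python
class ToySegment:
    """Toy write-once segment with a deletion bitset (illustrative only)."""

    def __init__(self, docs):
        self.docs = tuple(docs)   # write-once: never modified after creation
        self.deleted_bits = 0     # bit i set => document i is marked deleted

    def delete(self, pos):
        # Deleting only flips a bit; the stored document is left untouched.
        self.deleted_bits |= 1 << pos

    def is_deleted(self, pos):
        return bool((self.deleted_bits >> pos) & 1)

    def search(self, term):
        # Searches skip every position whose deleted bit is set.
        return [doc for pos, doc in enumerate(self.docs)
                if not self.is_deleted(pos) and term in doc]

    def num_deleted(self):
        return bin(self.deleted_bits).count("1")


segment = ToySegment(["red apple", "green apple", "red car"])
segment.delete(0)
print(segment.search("red"), segment.num_deleted())
```

The deleted document still occupies space in the segment here; only the search path ignores it, which matches the behaviour described above.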

Strictly speaking, Lucene does support in-place updates, but only for single-valued numeric docValues fields that are neither indexed nor stored, and only Solr supports this.

Upvotes: 5
