Reputation: 19398
Steps:
- Elasticsearch 2.3
- Create documents in ES => ~1 GB of disk is used
- Update the same documents in ES => ~2 GB of disk is used
Why does this happen? Is it due to versioning? Is it possible to avoid doubling the disk usage?
Currently we use forcemerge (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html), but it takes several hours.
Upvotes: 2
Views: 482
Reputation: 217314
When you index a document that already exists in ES, ES marks the previous document as deleted (but doesn't immediately remove it from the index) and indexes the new document.
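To make the mark-as-deleted behavior concrete, here is a toy model in Python. It is not Elasticsearch code: the `ToySegmentStore` class and its methods are hypothetical names, and it only assumes an append-only store where an update flags the old copy as deleted and a merge drops the flagged copies.

```python
class ToySegmentStore:
    """Toy append-only store; a sketch of the idea, not how ES is implemented."""

    def __init__(self):
        self.entries = []      # each entry is [doc_id, size_bytes, is_live]
        self.live_index = {}   # doc_id -> position of the live copy in entries

    def index(self, doc_id, size_bytes):
        """Index a document; an existing copy is only marked deleted, not removed."""
        old_pos = self.live_index.get(doc_id)
        if old_pos is not None:
            self.entries[old_pos][2] = False  # mark the previous copy as deleted
        self.entries.append([doc_id, size_bytes, True])
        self.live_index[doc_id] = len(self.entries) - 1

    def disk_usage(self):
        """Bytes on disk, counting deleted copies that were not merged away yet."""
        return sum(size for _, size, _ in self.entries)

    def merge(self):
        """Rewrite the store keeping only live copies, which reclaims the space."""
        self.entries = [entry for entry in self.entries if entry[2]]
        self.live_index = {entry[0]: pos for pos, entry in enumerate(self.entries)}


store = ToySegmentStore()
for doc_id in range(1000):
    store.index(doc_id, 1024)      # first version of every document
usage_after_create = store.disk_usage()

for doc_id in range(1000):
    store.index(doc_id, 1024)      # update every document once
usage_after_update = store.disk_usage()

store.merge()
usage_after_merge = store.disk_usage()
```

In this model, usage doubles after the update pass and returns to the original figure only after `merge()`, which mirrors the 1 GB to 2 GB jump in the question.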
Effectively, if your document weighs 1 KB, the space taken by the first document isn't reclaimed immediately once you have indexed a new version of it. So the first "version" of the document takes 1 KB and the second "version" takes another 1 KB, which is why your disk usage roughly doubles. Deleted documents are only removed when segments are merged: either you call the Force Merge API, as you have discovered, or you wait until segments are merged automatically under the hood. You should not really have to worry about this process.
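If you do want to trigger the cleanup yourself, a lighter variant of the force merge may be worth trying. The sketch below assumes a node at a base URL such as `http://localhost:9200` and an index name of your choosing (both placeholders), and uses the `only_expunge_deletes` flag of the `_forcemerge` endpoint, which limits the merge to segments containing deleted documents. The helper names are hypothetical, and whether this is actually faster on your data is something to measure rather than assume.

```python
import json
import urllib.request
from urllib.parse import quote


def forcemerge_url(base_url, index, only_expunge_deletes=True):
    """Build the _forcemerge URL for one index (base_url and index are placeholders)."""
    url = f"{base_url.rstrip('/')}/{quote(index)}/_forcemerge"
    if only_expunge_deletes:
        # Restrict the merge to segments that contain deleted documents.
        url += "?only_expunge_deletes=true"
    return url


def deleted_docs(stats, index):
    """Read the deleted-document count from a parsed GET /<index>/_stats/docs response."""
    return stats["indices"][index]["primaries"]["docs"]["deleted"]


def expunge_deletes(base_url, index):
    """POST the force merge request; requires a reachable Elasticsearch node."""
    request = urllib.request.Request(
        forcemerge_url(base_url, index), data=b"", method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Checking `docs.deleted` from the stats API before and after the call shows how many deleted documents are still occupying space, which helps decide whether a manual merge is worth the cost at all.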
Upvotes: 4