Ian

Reputation: 251

Larger index size after Elasticsearch reindex

After performing a reindex of a 75GB index, the new index came out at 79GB.

Both indexes have the same document count (54,123,676) and exactly the same mapping. The original index has 6 primary shards with one replica each (6*2) and the new one has 3 primaries with one replica each (3*2).

The original index also contains 75,857 deleted documents which were not carried across, so we are pretty stumped as to how it could be smaller than the new index at all, let alone by a whole 4GB.
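For reference, the figures below come from the index stats API; a minimal request to reproduce them (the index names here are placeholders) looks like this:

GET /original-index/_stats/docs,store,segments
GET /new-index/_stats/docs,store,segments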

Original Index

{
    "_shards": {
        "total": 12,
        "successful": 12,
        "failed": 0
    },
    "_all": {
        "primaries": {
            "docs": {
                "count": 54123676,
                "deleted": 75857
            },
            "store": {
                "size_in_bytes": 75357819717,
                "throttle_time_in_millis": 0
            },
            ...
            "segments": {
                "count": 6,
                "memory_in_bytes": 173650124,
                "terms_memory_in_bytes": 152493380,
                "stored_fields_memory_in_bytes": 17914688,
                "term_vectors_memory_in_bytes": 0,
                "norms_memory_in_bytes": 79424,
                "points_memory_in_bytes": 2728328,
                "doc_values_memory_in_bytes": 434304,
                "index_writer_memory_in_bytes": 0,
                "version_map_memory_in_bytes": 0,
                "fixed_bit_set_memory_in_bytes": 0,
                "max_unsafe_auto_id_timestamp": -1,
                "file_sizes": {}
            }
            ...

New Index

{
    "_shards": {
        "total": 6,
        "successful": 6,
        "failed": 0
    },
    "_all": {
        "primaries": {
            "docs": {
                "count": 54123676,
                "deleted": 0
            },
            "store": {
                "size_in_bytes": 79484557149,
                "throttle_time_in_millis": 0
            },
            ...
            "segments": {
                "count": 3,
                "memory_in_bytes": 166728713,
                "terms_memory_in_bytes": 145815659,
                "stored_fields_memory_in_bytes": 17870464,
                "term_vectors_memory_in_bytes": 0,
                "norms_memory_in_bytes": 37696,
                "points_memory_in_bytes": 2683802,
                "doc_values_memory_in_bytes": 321092,
                "index_writer_memory_in_bytes": 0,
                "version_map_memory_in_bytes": 0,
                "fixed_bit_set_memory_in_bytes": 0,
                "max_unsafe_auto_id_timestamp": -1,
                "file_sizes": {}
            }
            ...

Any clues?

Upvotes: 0

Views: 1411

Answers (1)

ozzimpact

Reputation: 111

You should use the segment merge feature. Since segments are immutable, ES always creates new ones and slowly merges them in the background. This request will help you solve your problem: it merges segments and reclaims space. Be aware, though, that this request is fairly heavy, so choose off-peak hours to execute it.

POST /_forcemerge?only_expunge_deletes=true
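
If you would rather target a single index than the whole cluster, a more specific variant (new-index is a placeholder name) merges each shard down to one segment, which typically reclaims the most space when shards still contain many segments or deleted documents:

POST /new-index/_forcemerge?max_num_segments=1

You can then confirm the resulting segment count and on-disk size per shard with:

GET /_cat/segments/new-index?v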

Upvotes: 1
