How can I validate a botched ElasticSearch recovery?

Question

On an older production node I'm running ElasticSearch 6.8.0. I needed to migrate indexes from an even older node as part of our plan to get up to date. These nodes are segregated - not replicating. I had been doing snapshots and recoveries in small batches for ease, but nearing the end of the project I think I bit off more than the node could chew. During the restoration of a large multi-index snapshot (500GB!) the node had a memory constraint issue and went kaput. I had feared the worst, but in order to recover I doubled the RAM and brought the VM back online. To my surprise, the recovery appears to have completed - all indices and shards are showing 100%! The stats of all indexes match up on both the origin node and the node being migrated to, which seems promising, but my experience in our field prevents me from getting any warm and fuzzy feelings yet.
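
For concreteness, here is a minimal sketch of the kind of stats comparison described above. It assumes both nodes answer `GET /_cat/indices?format=json` over plain HTTP without authentication; the function names and the host names in the usage comment are hypothetical. Matching document counts are only a sanity check and do not by themselves prove the restored data is intact.

```python
import json
from urllib.request import urlopen


def fetch_cat_indices(base_url):
    """Return the parsed JSON rows from GET <base_url>/_cat/indices?format=json."""
    with urlopen(f"{base_url}/_cat/indices?format=json") as resp:
        return json.load(resp)


def compare_indices(source_rows, target_rows):
    """List mismatches between two _cat/indices listings (source vs. target)."""
    target = {row["index"]: row for row in target_rows}
    problems = []
    for row in source_rows:
        name = row["index"]
        other = target.get(name)
        if other is None:
            problems.append(f"{name}: missing on target")
            continue
        # _cat returns counts as strings; compare them as-is.
        if row.get("docs.count") != other.get("docs.count"):
            problems.append(
                f"{name}: docs.count {row.get('docs.count')} != {other.get('docs.count')}"
            )
        # Red means at least one primary shard is unassigned.
        if other.get("health") == "red":
            problems.append(f"{name}: health is red on target")
    return problems


# Example usage (hypothetical hosts):
# old = fetch_cat_indices("http://old-node:9200")
# new = fetch_cat_indices("http://new-node:9200")
# for line in compare_indices(old, new):
#     print(line)
```

An empty result means every index on the origin node exists on the target with the same document count and is not red. Store sizes are deliberately left out of the comparison, since they can differ between nodes even when the documents are identical.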

My question: Is this expected from ES - a miracle recovery by some standards? Is there any foolproof way of validating this? Should I be comfortable with the status and carry on, or should I close the indices that were part of the "failed" snapshot recovery and run the recovery again?

Obviously I'm not an ElasticSearch guru - this tech got dropped in my lap so I'm learning as I go.

Thanks all!
