Reputation: 273
We operate an elasticsearch stack which uses 3 nodes to store log data. Our current config is to have indices with 3 primaries and 1 replica. (We have just eyeballed this config and are happy with the performance, so we decided to not (yet) spend time for optimization)
After a node outage (let's assume a full disk), I have observed that elasticsearch automatically redistributes its shards to the remaining instances - as advertised.
However this increases disk usage on the remaining two instances, making it a candidate for cascading failure.
Durability of the log data is not paramount. I am therefore thinking about reconfiguring elasticsearch to not create a new replica after a node outage. Instead, it could just run on the primaries only. This means that after a single node outage, we would run without redundancy. But that seems better than a cascading failure. (This is a one time cost)
An alternative would be to just increase disk size. (This is an ongoing cost)
(How) can I configure elasticsearch to not create new replicas after the first node has failed? Or is this considered a bad idea and the canonical way is to just increase disk capacity?
Upvotes: 1
Views: 130
Reputation: 273
When a node leaves the cluster, some additional load is generated on the remaining nodes:
This can lead to quite some data being moved around.
Sometimes, a node is only missing for a short period of time. A full rebalance is not justified in such a case. To take account for that, when a node goes down, then elasticsearch immediatelly promotes a replica shard to primary for each primary that was on the missing node, but then it waits for one minute before creating new replicas to avoid unnecessary copying.
The duration of this delay is a tradeoff and can therefore be configured. Waiting longer means less chance of useless copying but also more chance for downtime due to reduced redundancy.
Increasing the delay to a few hours results in what I am looking for. It gives our engineers some time to react, before a cascading failure can be created from the additional rebalancing load.
I learned that from the official elasticsearch documentation.
Upvotes: 1