Reputation: 1281
I have an Elasticsearch cluster with several indices on two data nodes (es-data-0 and es-data-1) and want to move all shards off es-data-1 before decommissioning it.
Moving shards one at a time works well. The following command takes several seconds to move a shard:
POST /_cluster/reroute
{
"commands": [
{
"move": {
"index" : "index_operations_log",
"shard" : 0,
"from_node" : "es-data-1",
"to_node" : "es-data-0"
}
}
]
}
But when I try cluster-level shard allocation filtering, it has no effect. For example, the following requests have no apparent effect on shard status:
PUT /_cluster/settings
{
"transient" : {
"cluster.routing.rebalance.enable": "none"
}
}
PUT /_cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._name": "es-data-1"
}
}
even though the second request returns:
{
"acknowledged": true,
"persistent": {},
"transient": {
"cluster": {
"routing": {
"allocation": {
"exclude": {
"_name": "es-data-1"
}
}
}
}
}
}
What am I missing?
Upvotes: 1
Views: 1478
Reputation: 1281
Figured it out. I ran
GET /_cluster/settings
and saw that I had set some cluster.routing.allocation settings earlier that conflicted with these new ones. I cleared the conflicting rules by setting their values to "" and the shards started moving over.
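To make that check less error-prone, here is a rough Python sketch that scans a GET /_cluster/settings response for leftover allocation filter rules. It assumes the response is the default nested JSON (as in the output shown in the question) and that you have already parsed it into a dict. The function names and the `require._name` entry in the sample data are made up for illustration; they are not from my cluster.

```python
from typing import Any, Dict, Tuple

# Prefixes of the cluster-level allocation filter settings.
ALLOCATION_FILTER_PREFIXES = (
    "cluster.routing.allocation.include.",
    "cluster.routing.allocation.exclude.",
    "cluster.routing.allocation.require.",
)


def flatten_settings(node: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested settings into a flat dict keyed by dotted setting names."""
    flat: Dict[str, Any] = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_settings(value, path))
        else:
            flat[path] = value
    return flat


def find_allocation_filters(
    cluster_settings: Dict[str, Any],
) -> Dict[Tuple[str, str], Any]:
    """Return every allocation filter rule, keyed by (scope, setting name)."""
    found: Dict[Tuple[str, str], Any] = {}
    for scope in ("persistent", "transient"):
        flat = flatten_settings(cluster_settings.get(scope, {}))
        for key, value in flat.items():
            if key.startswith(ALLOCATION_FILTER_PREFIXES):
                found[(scope, key)] = value
    return found


# Hypothetical response: the exclude rule from the question plus an
# invented leftover "require" rule that would conflict with it.
example = {
    "persistent": {},
    "transient": {
        "cluster": {
            "routing": {
                "allocation": {
                    "exclude": {"_name": "es-data-1"},
                    "require": {"_name": "es-data-1"},
                }
            }
        }
    },
}

for (scope, key), value in find_allocation_filters(example).items():
    print(f"{scope}: {key} = {value!r}")
```

Anything this lists besides the rule you just set is a candidate for the conflict. The sketch only looks at cluster-level settings; index-level rules live in the per-index settings and would need a similar pass.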
In general, the
PUT /_cluster/settings
{
"transient" : {
"cluster.routing.allocation.require": "..."
}
}
call doesn't report errors when rules conflict, so the only way I've found to troubleshoot issues like this is: if the shards aren't moving as expected, try moving them one at a time using POST /_cluster/reroute, which reports detailed errors. Then, if you can move individual shards with POST /_cluster/reroute but cluster- or index-level allocation filtering still isn't working, use
GET /_cluster/settings
and
GET /*/_settings
to check for other existing routing allocation rules that conflict.
If any exist, they can be reset with PUT /_cluster/settings (or PUT /<index>/_settings for index-level rules) by setting their values to "".
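If you have several stale rules to clear, a small helper can build the reset request body for you. This is only a sketch: it uses null rather than "" as the reset value, which is how the Elasticsearch docs describe removing a cluster setting, and the setting name passed in below is a hypothetical example. The function name is my own.

```python
import json
from typing import Any, Dict, Iterable


def build_reset_body(keys: Iterable[str], scope: str = "transient") -> Dict[str, Any]:
    """Build a PUT /_cluster/settings body that resets each given setting."""
    if scope not in ("transient", "persistent"):
        raise ValueError(f"unknown settings scope: {scope!r}")
    # None is serialized as JSON null, which removes the setting.
    return {scope: {key: None for key in keys}}


# Hypothetical stale rule; substitute whatever GET /_cluster/settings shows.
body = build_reset_body(["cluster.routing.allocation.require._name"])
print("PUT /_cluster/settings")
print(json.dumps(body, indent=2))
```

Use the same scope (transient or persistent) the stale rule was originally set in, otherwise the reset won't touch it.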
Upvotes: 2