user6826691

Reputation: 2011

How to fix red status on an OpenSearch cluster?

We have an OpenSearch cluster and noticed that it was down. AWS support helped me recover the cluster, but although it is active now, it is still in RED status because one of the shards is unassigned.

It looks like the shard became unassigned during the outage we had with the cluster. I'm not sure how to get back to green status.

Any suggestions on how to fix this?

Should I delete this shard? Would that fix it? I tried reassigning it, but that does not seem to work because the shard copy is missing. Our backups were also affected when the cluster was down.

GET _cluster/health?pretty

{
  "cluster_name" : "xxxx-xxx-xxx",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "discovered_master" : true,
  "active_primary_shards" : 150,
  "active_shards" : 300,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 4,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 98.68421052631578
}

GET _cluster/allocation/explain?pretty

{
  "index" : ".opendistro-alerting-alerts",
  "shard" : 4,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2022-01-11T13:14:16.096Z",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",
  "node_allocation_decisions" : [ {
    "node_id" : "xxxxx",
    "node_name" : "sssssssssssssssssssss",
    "node_decision" : "no",
    "store" : {
      "found" : false
    }
  }

Upvotes: 2

Views: 6439

Answers (1)

Riz

Reputation: 1167

You can check this link to figure out the cause, in case you are not sure that the unassigned shard is the only reason. A cluster can be red for different reasons.
You can try to delete the shard, but I think you will need to delete the whole index. You can find the red index with GET /_cat/indices?v. You will lose some data, but your cluster should go back to green.
UPDATE: An unassigned shard can't be deleted, because it is not present on any node at all.
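
If you want to script the check before deleting anything, here is a minimal sketch in Python. It only inspects JSON you have already fetched, so it makes no assumptions about how you authenticate to the domain. The function names `red_indices` and `primary_copy_lost` are my own, not part of any OpenSearch client, and the sample rows are illustrative except for the values copied from the question. It assumes the first input is the parsed response of GET /_cat/indices?format=json and the second is the parsed response of GET _cluster/allocation/explain.

```python
def red_indices(cat_rows):
    """Return the names of indices whose health column is 'red'.

    cat_rows: parsed JSON list from GET /_cat/indices?format=json
    """
    return [row["index"] for row in cat_rows if row.get("health") == "red"]


def primary_copy_lost(explain):
    """Return True when the explain output describes a primary shard
    for which no node holds a copy any more.

    explain: parsed JSON object from GET _cluster/allocation/explain
    """
    if not explain.get("primary"):
        return False
    if explain.get("can_allocate") != "no_valid_shard_copy":
        return False
    decisions = explain.get("node_allocation_decisions", [])
    # Every listed node must report that it found no copy in its store.
    return all(not d.get("store", {}).get("found", False) for d in decisions)


# Illustrative input: only the red row mirrors the question.
sample_rows = [
    {"health": "green", "index": "some-healthy-index"},
    {"health": "red", "index": ".opendistro-alerting-alerts"},
]

# Trimmed copy of the explain output posted in the question.
sample_explain = {
    "index": ".opendistro-alerting-alerts",
    "shard": 4,
    "primary": True,
    "current_state": "unassigned",
    "can_allocate": "no_valid_shard_copy",
    "node_allocation_decisions": [
        {"node_decision": "no", "store": {"found": False}},
    ],
}

print(red_indices(sample_rows))
print(primary_copy_lost(sample_explain))
```

If `primary_copy_lost` returns True for a shard and there is no usable snapshot, the data in that shard is gone, and deleting the index named by `red_indices` (DELETE /<index name>) and letting it be recreated is the remaining option I know of. If any node reports `"found": true`, do not delete; the copy may still be recoverable.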

Upvotes: 1
