Elasticsearch: what does "shard allocation" mean?

Question

We had a production incident in which the Elasticsearch cluster health check returned a red status. The health report shows that .marvel-2019.06.20 has 2 unassigned_shards, which seems to be the root cause.

curl -XGET 'localhost:9200/_cluster/health?level=indices&pretty'

{
  "cluster_name" : "sap-jam-jam8",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 122,
  "active_shards" : 239,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 7,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "indices" : {
    ...
    ...
    ".marvel-2019.06.20" : {
      "status" : "red",
      "number_of_shards" : 1,
      "number_of_replicas" : 1,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 2
    }
  }
}

We checked the Elasticsearch cluster settings and found that cluster.routing.allocation.enable had been set to none, which disables shard allocation.

curl -XGET 'localhost:9200/_cluster/settings?pretty'
{
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "none"
        }
      }
    }
  }
}
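
For reference, a minimal sketch of how allocation could be switched back on through the same settings API. The setting cluster.routing.allocation.enable and its values all and none are standard Elasticsearch settings; the helper names allocation_settings_body and set_allocation are only illustrative, and localhost:9200 is assumed from the commands above.

```shell
# Hypothetical helper: prints the JSON body that sets the transient
# cluster.routing.allocation.enable setting to the given value (all, none, ...).
allocation_settings_body() {
  printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Hypothetical helper: sends that body to the cluster settings API.
# Assumes the node listens on localhost:9200. The Content-Type header is
# required by newer Elasticsearch versions and harmless on older ones.
set_allocation() {
  curl -XPUT 'localhost:9200/_cluster/settings?pretty' \
    -H 'Content-Type: application/json' \
    -d "$(allocation_settings_body "$1")"
}
```

Calling set_allocation all would re-enable allocation; set_allocation none would reproduce the state shown above.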

As this Stack Overflow post suggested, we forced the shard to be assigned, and the issue went away.

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : ".marvel-2014.05.21",
      "shard" : 0,
      "node" : "SOME_NODE_HERE",
      "allow_primary" : true
    }
  } ]
}'
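
To see which shards are still unassigned before and after such a reroute, the _cat/shards output can be filtered. This is a minimal sketch, assuming the default column order of _cat/shards (index, shard, prirep, state, ...) and no header row; the helper name unassigned_shards is illustrative.

```shell
# Hypothetical helper: reads _cat/shards output on stdin and keeps only the
# rows whose 4th column (state) is UNASSIGNED, printing index, shard number
# and p/r (primary or replica) for each.
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}
```

Against a live node this could be used as: curl -s 'localhost:9200/_cat/shards' | unassigned_shards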

After resolving this incident, I think it is necessary to understand the basic concept of shard allocation. I did some research, but the following questions still confuse me.

1. Why does Elasticsearch need to `assign a shard` to other nodes?

In my case, we have two Elasticsearch nodes, A and B. Two shards have already been created on A and consume disk space there.

When B is not available, why not just activate those two shards on server A?

That would at least return a yellow health status.

2. What is the procedure to `assign a shard`?

As in the first question, suppose both the primary shard and the replica have been created on server A. What does it mean to assign a shard to B?

Does that mean copying the shard from server A to server B?

3. How can zero active shards be explained?

Both the primary shard and the replica have been created, but neither is active. How is that possible? Besides disk storage, is there any other overhead to activating a shard, e.g. memory?

".marvel-2019.06.20" : {
  "status" : "red",
  "number_of_shards" : 1,
  "number_of_replicas" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0, // both shards are inactive.
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 2
}
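
For context, the documented meaning of the status colours is: red if at least one primary shard is not active, yellow if all primaries are active but some replica is not, green otherwise. Below is a minimal sketch of that rule applied to the per-index counts in the report; the function name index_health and its argument order are illustrative, not part of Elasticsearch.

```shell
# Hypothetical helper implementing the documented status rule.
# Usage: index_health NUMBER_OF_SHARDS NUMBER_OF_REPLICAS ACTIVE_PRIMARY_SHARDS ACTIVE_SHARDS
index_health() {
  primaries=$1
  replicas=$2
  active_primary=$3
  active_total=$4
  # Every primary has NUMBER_OF_REPLICAS copies, so a fully healthy index
  # has primaries * (replicas + 1) active shards.
  expected_total=$(( primaries * (replicas + 1) ))
  if [ "$active_primary" -lt "$primaries" ]; then
    echo red      # at least one primary is not active
  elif [ "$active_total" -lt "$expected_total" ]; then
    echo yellow   # all primaries active, some replica missing
  else
    echo green    # every primary and replica is active
  fi
}
```

With the values above, index_health 1 1 0 0 prints red, while index_health 1 1 1 1 (primary active, replica unassigned) prints yellow.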

4. Is the following assumption true?

To make a shard active, Elasticsearch needs to perform the following steps:

Create the shard.
Find a server that has enough disk space and RAM to run it.
Copy the shard from the source server to the destination server.
Activate the shard.
