Jananath Banuka
Jananath Banuka

Reputation: 3953

AWS OpenSearch - this action would add [10] total shards, but this cluster currently has [3996]/[4000] maximum shards open

I have an OpenSearch cluster on AWS and it has some data. The data are ingested from the Kinesis Firehose, but I am getting the following error:

{"attemptsMade":8,"arrivalTimestamp":1660873371793,"errorCode":"400","errorMessage":"{"type":"illegal_argument_exception","reason":"Validation Failed: 1: this action would add [10] total shards, but this cluster currently has [3996]/[4000] maximum shards open;"}"

And I have 4 node cluster, and when I try to get the number of shards allocated for each node as below:

  shards disk.indices disk.used disk.avail disk.total         
   983       89.8gb   115.1gb    376.8gb      492gb           
   983       91.2gb   116.5gb    375.4gb      492gb           
   983       89.1gb   114.5gb    377.5gb      492gb           
   983       90.6gb   115.9gb      376gb      492gb    

In above, what does the shards column mean? Is that the total number (max) of shards that node can accomodate?

And then I tried to get all the indices with their shards, PS: in below, I haven't added all the indices as there are over 3000+ indices, so here's a few of them:

GET _cat/shards?v


index                                                      shard prirep state      docs    store           
mc-2022-08-07                                              4     p      STARTED      23  182.5kb 
mc-2022-08-07                                              4     r      STARTED      23  182.5kb 
mc-2022-08-07                                              2     r      STARTED      13  217.6kb 
mc-2022-08-07                                              3     p      STARTED       9  192.9kb 
mc-2022-08-07                                              1     p      STARTED      10    193kb 
mc-2022-08-07                                              0     p      STARTED      13   71.3kb 
. . .

And I tried to add all the values in the above output's shard column and I ended up getting the value 7506

But according to the error message above, it can't go over 4000 but it already has the value 7506

Can someone help me understand what is happening here? Thank you

Upvotes: 0

Views: 6700

Answers (1)

Val
Val

Reputation: 217514

The first output shows that each of your nodes has 983 shards. It seems that you're trying to add another index with 5 primary shards + one replica for each, which means 10 additional shards.

There's a cluster-wide setting called cluster.max_shards_per_node that prevents having more than 1000 shards per node, in your case 4 x 1000 = 4000. This is of course a default value that can be changed and simply acts as a safety net to not overload the cluster.

The command to lift that limit is this one:

PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 1100
  }
}

HOWEVER, looking at your second output, we can draw a few points:

  • you have too many shards
  • and probably each of your shards are too small a shard is supposed to contain up to 10-50 GB of data, but in your case we're down to a few KB

We don't see the full output of _cat/shards but depending on your workload, each of your index would probably be fine with a single primary shard. You can specify that at index creation time or in an index template where you define the default mappings and settings for your index.

Another point regarding the output of _cat/shards, the number you see in the shards column is NOT a number of shards, but the id of the shard. If your index has 5 primary shards + one replica, you'll get

  • primary shard 0
  • replica shard 0
  • primary shard 1
  • replica shard 1
  • primary shard 2
  • replica shard 2
  • primary shard 3
  • replica shard 3
  • primary shard 4
  • replica shard 4

If you count the number of lines in the _cat/shards output, however, you'll get 3996 lines.

In summary, I would strongly suggest that you implement Index Lifecycle Management (ILM) or since you're on Opensearch, the feature is called Index State Management (ISM) in order to only create a new index when absolutely necessary.

Also I would advise to reindex small daily indices together into bigger indices (weekly, monthly, etc)

Upvotes: 4

Related Questions