Reputation: 3953
I have an OpenSearch cluster on AWS with some data in it. The data is ingested from Kinesis Firehose, but I am getting the following error:
{"attemptsMade":8,"arrivalTimestamp":1660873371793,"errorCode":"400","errorMessage":"{\"type\":\"illegal_argument_exception\",\"reason\":\"Validation Failed: 1: this action would add [10] total shards, but this cluster currently has [3996]/[4000] maximum shards open;\"}"}
I have a 4-node cluster, and when I check the number of shards allocated to each node, I get the following:
shards disk.indices disk.used disk.avail disk.total
983 89.8gb 115.1gb 376.8gb 492gb
983 91.2gb 116.5gb 375.4gb 492gb
983 89.1gb 114.5gb 377.5gb 492gb
983 90.6gb 115.9gb 376gb 492gb
In the above, what does the shards column mean? Is that the total (maximum) number of shards that the node can accommodate?
Then I tried to get all the indices with their shards. (PS: I haven't included all the indices below, since there are over 3000 of them; here are just a few:)
GET _cat/shards?v
index shard prirep state docs store
mc-2022-08-07 4 p STARTED 23 182.5kb
mc-2022-08-07 4 r STARTED 23 182.5kb
mc-2022-08-07 2 r STARTED 13 217.6kb
mc-2022-08-07 3 p STARTED 9 192.9kb
mc-2022-08-07 1 p STARTED 10 193kb
mc-2022-08-07 0 p STARTED 13 71.3kb
. . .
Then I added up all the values in the shard column of the above output and ended up with 7506. But according to the error message above, the cluster can't go over 4000 shards, yet it already seems to have 7506. Can someone help me understand what is happening here? Thank you.
Upvotes: 0
Views: 6700
Reputation: 217514
The first output shows that each of your nodes has 983 shards. It seems that you're trying to add another index with 5 primary shards + one replica for each, which means 10 additional shards.
There's a cluster-wide setting called cluster.max_shards_per_node that prevents having more than 1000 shards per node; in your case, 4 x 1000 = 4000. This is of course a default value that can be changed, and it simply acts as a safety net to avoid overloading the cluster.
The command to lift that limit is this one:
PUT _cluster/settings
{
"persistent": {
"cluster.max_shards_per_node": 1100
}
}
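To check the value currently in effect (including the default, if you haven't overridden it yet), you can query the cluster settings; `include_defaults` and `filter_path` are standard query parameters, and the wildcard pattern below just narrows the response to the relevant setting:

```
GET _cluster/settings?include_defaults=true&filter_path=**.max_shards_per_node
```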
HOWEVER, looking at your second output, we can make a few observations:
We don't see the full output of _cat/shards, but depending on your workload, each of your indices would probably be fine with a single primary shard. You can specify that at index creation time, or in an index template where you define the default mappings and settings for your indices.
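As a sketch, assuming your daily indices all match a pattern like mc-* (the template name mc-template is made up for the example), an index template that defaults new indices to a single primary shard could look like this:

```
PUT _index_template/mc-template
{
  "index_patterns": ["mc-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
```

With this in place, every new mc-2022-xx-xx index would be created with 2 shards (1 primary + 1 replica) instead of 10.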
Another point regarding the output of _cat/shards: the number you see in the shard column is NOT a number of shards, but the id of the shard. If your index has 5 primary shards + one replica each, you'll get 10 rows for that index, with shard ids 0 through 4 each appearing twice (once for the primary, p, and once for the replica, r). That's why summing that column gives a meaningless number like 7506.
If you count the number of lines in the _cat/shards
output, however, you'll get 3996 lines.
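You don't have to count lines by hand: the cluster health API reports the total directly, and in your case active_shards should match the 3996 from the error message:

```
GET _cluster/health?filter_path=active_primary_shards,active_shards
```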
In summary, I would strongly suggest that you implement Index Lifecycle Management (ILM), or, since you're on OpenSearch, the equivalent feature called Index State Management (ISM), in order to only create a new index when absolutely necessary.
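To give an idea of what ISM looks like, here is a minimal policy sketch (the policy name, size, and age thresholds are made up; tune them to your workload). It rolls an index over once it grows large or old enough, then deletes it after 90 days:

```
PUT _plugins/_ism/policies/rollover_policy
{
  "policy": {
    "description": "Roll over indices by size/age, delete after 90 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "rollover": { "min_size": "30gb", "min_index_age": "30d" } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ]
  }
}
```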
Also, I would advise reindexing small daily indices together into bigger indices (weekly, monthly, etc.).
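As an illustration, using your mc-2022-08-* daily indices as the source (the destination index name is made up), the Reindex API can merge a whole month of daily indices into a single monthly one:

```
POST _reindex
{
  "source": { "index": "mc-2022-08-*" },
  "dest": { "index": "mc-monthly-2022-08" }
}
```

Once the reindex completes and you've verified the document counts, you can delete the original daily indices to reclaim their shards.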
Upvotes: 4