Reputation: 1
I am new to elasticsearch.Suppose we have a two node cluster and have a config of 2 primary shards and one replica for our single index.So node 1 has P0,R1 and node 2 has P1,R0. Now suppose later on I reduce the number of replicas to 0.Then will the shards P0 and P1 automatically resize themselves to occupy the disk space vacated by replicas and allow me greater disk space for indexing then previously when I had replicas.
Upvotes: 0
Views: 474
Reputation: 217254
A replica shard takes more or less the same space as its primary since both contain the same documents. So, say, you have indexed 1 million documents in your index, then each primary shard contains more or less half that amount of documents, i.e. 500K document and each replica contains the same number of documents as well.
If each document weighs 1KB, then:
Which means that your index occupies 2GB of disk space on your node. If you later reduce the number of replicas to 0, then that will free up 1GB of space that your primary shards will be able to occupy, indeed.
However, note that by doing so, you certainly gain disk space, but you won't have any redundancy anymore and you will not be able to spread your index over two nodes, which is the main idea behind replicas to begin with.
The other thing is that the size of a shard is bounded by a physical limit that it will not be able to cross. That limit is dependent on many factors, among which the amount of heap and the total physical memory you have. If you have 2GB of heap and 50GB of disk space, you cannot expect to index 50GB of data into your index, that won't work, or will be very slow and unstable.
=> So the disk space only should not be the main driver for sizing your shards. Having enough disk space is necessary condition but not a sufficient one, you also need to look at the RAM and the heap allocated to your ES node.
Upvotes: 2