Reputation: 197
For example, if the replication factor is 3 and there are 2 nodes in the cluster, how many replicas will be created? How will they be placed?
Upvotes: 1
Views: 1173
Reputation: 3421
Having a replication factor greater than the number of available datanodes defeats the purpose of replication. Replicas of a block should be placed on distinct datanodes. If one datanode held more than one replica of the same block (theoretically), that would not provide any additional fault tolerance, because if that node goes down, both replicas are lost. So one replica per node is enough.
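To make the fault-tolerance argument concrete, here is a minimal Python sketch. It is a toy model, not HDFS code: the function name `surviving_replicas` and the node names `dn1`/`dn2` are hypothetical and chosen only for illustration.

```python
def surviving_replicas(placement, failed_node):
    """Count replicas of one block that remain after a single node fails.

    placement: list of datanode names, one entry per replica of the block.
    failed_node: name of the datanode that goes down.
    """
    return sum(1 for node in placement if node != failed_node)


# Hypothetical case: two replicas stacked on the same node.
# Losing dn1 loses every copy of the block.
stacked = ["dn1", "dn1"]

# Two replicas on distinct nodes: losing dn1 still leaves one copy.
spread = ["dn1", "dn2"]

print(surviving_replicas(stacked, "dn1"))  # 0
print(surviving_replicas(spread, "dn1"))   # 1
```

The second replica only helps when it lives on a different node, which is why extra copies on the same datanode would be pointless.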
And to answer your questions:
What is the relationship between the replication factor and the number of datanodes in the cluster? Ans. The maximum replication factor should be less than or equal to the number of datanodes.
If the replication factor is 3 and there are 2 nodes in the cluster, how many replicas will be created? Ans. As far as I tried, only 2 replicas are created. (Try it using the hdfs dfs -setrep command.)
How will they be placed? Ans. They will be placed one per datanode.
Hence, when you set a replication factor higher than the number of datanodes, the extra replicas you are trying to create will be reported as Missing replicas in the hdfs fsck output. Also, the corresponding blocks will be treated as Under-Replicated Blocks.
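The accounting described in this answer can be sketched as simple arithmetic. The Python below only models the behaviour reported above (one replica per datanode, the remainder counted as missing); `replica_report` and its field names are hypothetical and are not part of any HDFS API.

```python
def replica_report(replication_factor, num_datanodes):
    """Model how many replicas of a block get placed and how many are missing.

    Assumes at most one replica of a block per datanode, as described above.
    """
    placed = min(replication_factor, num_datanodes)
    missing = replication_factor - placed
    return {
        "placed": placed,
        "missing": missing,
        # A block with fewer live replicas than requested is under-replicated.
        "under_replicated": missing > 0,
    }


print(replica_report(3, 2))
# {'placed': 2, 'missing': 1, 'under_replicated': True}
```

With a replication factor of 3 and 2 datanodes, the model gives 2 placed replicas and 1 missing replica, matching what was observed in the test described above.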
Upvotes: 4