Young Kim

Reputation: 23

How does HDFS replicate when there are fewer nodes than the replication factor?

For instance, if a Hadoop cluster consists of 2 DataNodes and the HDFS replication factor is set to the default of 3, what is the default behavior for how files are replicated?

From what I've read, it seems that HDFS bases it on rack awareness, but for cases like this, does anyone know how it is determined?

Upvotes: 2

Views: 4034

Answers (1)

Razvan

Reputation: 10101

HDFS will consider the blocks under-replicated, keep reporting them as such, and continually try to bring them up to the expected replication factor.

HDFS has a parameter (the replication factor, 3 by default) that tells the NameNode how many replicas each block should have (in the default case, each block should be replicated 3 times across the cluster, according to the configured replica placement strategy). Until the system manages to replicate each block as many times as specified by the replication factor, it will keep trying to do so.
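On a 2-node cluster, one practical workaround is to lower the replication target so the NameNode stops retrying. A sketch using standard HDFS commands (paths here are hypothetical placeholders):

```shell
# Check replication status; under-replicated blocks are listed in the report
hdfs fsck / -files -blocks

# Lower the replication factor for existing files to 2 and wait
# until the change takes effect
hdfs dfs -setrep -w 2 /

# For new files, override dfs.replication per command
# (or set it cluster-wide in hdfs-site.xml)
hdfs dfs -D dfs.replication=2 -put localfile /path/in/hdfs
```

Setting `dfs.replication` in `hdfs-site.xml` only affects files written after the change; existing files keep the replication factor they were created with, which is why `-setrep` is needed for them.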

Upvotes: 4
