Joel

Reputation: 185

Changing replication factor in hadoop

I'm doing some experiments with Hadoop. For that, I have to play with some configuration options such as the block size and the replication factor. For the replication factor, I tried this command:

$HADOOP_HOME/bin/hadoop fs -setrep -w -R $var input

where "input" is the file for which I want to change the replication factor, and $var represents the replication factor I want.

When $var=1, it works. For any other value, it prints the following and never returns:

Replication 2 set: input
Waiting for input..............................................................
..................................

It keeps printing dots indefinitely. What can I do?

Upvotes: 1

Views: 1166

Answers (2)

mattinbits

Reputation: 10428

Since you only have one datanode, HDFS cannot satisfy your request. The -w flag tells the command to wait until replication is complete, which can never happen here. HDFS places at most one replica of a block on each datanode, so a replication factor greater than 1 is not achievable with a single node.
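As a minimal sketch of how to avoid the endless wait, the requested factor can be capped at the number of live datanodes before calling -setrep. The helper name cap_replication is hypothetical, and the datanode count is passed in by hand here (it can be read from the output of hdfs dfsadmin -report, whose format differs between Hadoop versions).

```shell
# Hypothetical helper: cap a requested replication factor at the number of
# live datanodes, since HDFS keeps at most one replica of a block per datanode.
cap_replication() {
    requested=$1
    live_datanodes=$2
    if [ "$requested" -gt "$live_datanodes" ]; then
        echo "$live_datanodes"
    else
        echo "$requested"
    fi
}

# Usage (assumes a running cluster; the datanode count 1 is supplied manually):
# rep=$(cap_replication "$var" 1)
# "$HADOOP_HOME/bin/hadoop" fs -setrep -w -R "$rep" input
```

With one datanode this always yields 1, which is the only value the question reports as working.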

Upvotes: 2

hadooper

Reputation: 746

The optional "-w" flag tells the command to wait for the replication to complete, which can potentially take a very long time.

  • It depends on the size of the file whose replication factor you are setting.
  • When $var=1, HDFS only has to delete the extra replicas on the other nodes (assuming yours is a multi-node cluster).
  • When $var is greater than the existing value, it takes longer because the namenode has to find datanodes that are free and ready to accept the new replicas, and the blocks then have to be copied.
  • If the cluster is busy with other copy operations, that can also cause delays.


To check whether the replication has completed:

hadoop fsck /path/to/file

The command above reports the number of blocks and the replication status of the file. Adding -files -blocks -locations also lists where each block is stored, along with more details.
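To turn that check into something scriptable, here is a rough sketch that pulls the under-replicated block count out of the fsck report. The helper name under_replicated_count is hypothetical, and it assumes the summary contains a line of the form "Under-replicated blocks: N (P %)"; verify that against the output of your Hadoop version.

```shell
# Hypothetical helper: read `hadoop fsck` output on stdin and print the
# number of under-replicated blocks. Assumes a summary line shaped like
# "Under-replicated blocks: N (P %)"; prints nothing if no such line exists.
under_replicated_count() {
    awk -F: '/Under-replicated blocks/ { split($2, f, " "); print f[1]; exit }'
}

# Usage (assumes a running cluster):
# hadoop fsck /path/to/file | under_replicated_count
```

A result of 0 suggests the file has reached its target replication; a value that never drops is consistent with the cluster not having enough datanodes.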

Upvotes: 0
