Reputation: 31
In MapReduce (Hadoop), what do under-replicated and over-replicated blocks mean, and how are over-replication and under-replication balanced?
Upvotes: 3
Views: 3813
Reputation: 29195
As you probably know, the default replication factor is 3.
Over-replicated blocks are blocks that exceed the target replication of the file they belong to. Normally, over-replication is not a problem: HDFS automatically deletes the excess replicas, and that is how it is balanced in this case.
Under-replicated blocks are blocks that do not meet their target replication for the file they belong to.
To balance these, HDFS automatically creates new replicas of under-replicated blocks until they meet the target replication.
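Since both cases are defined relative to a file's target replication, changing that target is what makes HDFS add or drop replicas. A minimal sketch, assuming a hypothetical path /user/example/data and a target of 3 (neither comes from the question); the script only runs the command if an hdfs client is on the PATH:

```shell
#!/bin/sh
# Hypothetical path and target replication; adjust both for your cluster.
target_path="/user/example/data"
replication=3

# -w makes setrep wait until the target replication is reached,
# which can take a long time on large files.
cmd="hdfs dfs -setrep -w $replication $target_path"

if command -v hdfs >/dev/null 2>&1; then
    $cmd
else
    echo "hdfs not found; would run: $cmd"
fi
```

Lowering the target leaves blocks over-replicated until HDFS removes the extras; raising it leaves them under-replicated until new replicas are created.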
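You can get information about the blocks being replicated (or waiting to be replicated) using
hdfs dfsadmin -metasave <filename>
which writes the report to the named file in the NameNode's log directory (hadoop.log.dir).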
If you run the command below, you will get detailed statistics.
hdfs fsck /
......................
Status: HEALTHY
Total size: 511799225 B
Total dirs: 10 Total files: 22
Total blocks (validated): 22 (avg. block size 23263601 B)
Minimally replicated blocks: 22 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 4
Number of racks: 1
The filesystem under path '/' is HEALTHY
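If you want to check these counters from a script, the summary lines can be parsed directly. A minimal sketch, assuming the "Label: value" layout shown above (other Hadoop versions may format the report differently); fsck_field is a hypothetical helper name, and the sample text is a shortened copy of the output above so the snippet runs without a cluster:

```shell
#!/bin/sh
# Hypothetical helper: read an fsck summary on stdin and print the first
# value that follows the given label.
fsck_field() {
    # $1 = label exactly as fsck prints it, e.g. "Under-replicated blocks"
    awk -F':' -v label="$1" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
        key == label { split($2, parts, " "); print parts[1]; exit }
    '
}

# Sample lines copied from the report above.
# On a real cluster, use instead: report=$(hdfs fsck /)
report='Status: HEALTHY
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Missing replicas: 0 (0.0 %)'

under=$(printf '%s\n' "$report" | fsck_field "Under-replicated blocks")
over=$(printf '%s\n' "$report" | fsck_field "Over-replicated blocks")
echo "under=$under over=$over"
```

A nonzero value for either counter shortly after a node failure or a replication change is expected; it is only a concern if it does not go back to 0 over time.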
Upvotes: 1