Reputation: 1661
I just started using Hadoop and have been playing around with it. I googled a bit and found out that I have to change the properties in hdfs-site.xml to change the default replication factor... so that's what I did, and to be honest it works like a charm. When I add new files they automatically get replicated with the new replication factor (the property I changed is shown after the fsck output below). But when I run something like:
hdfs fsck /
the output says that the default replication is 1. Maybe I'm just being pedantic about it, but I'd rather have that fixed... or should I say, I've been relying on that output, and because of it it took me a long time to realise that nothing is actually wrong... or maybe there is something wrong? Can someone help me interpret that fsck output?
..Status: HEALTHY
Total size: 1375000000 B
Total dirs: 1
Total files: 2
Total symlinks: 0
Total blocks (validated): 12 (avg. block size 114583333 B)
Minimally replicated blocks: 12 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 4
Number of racks: 1
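For reference, the change I made in hdfs-site.xml was something along these lines (a sketch of my setup rather than an exact copy; the value 2 is what I set, judging by the average block replication above):

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>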
Upvotes: 0
Views: 469
Reputation: 111
Sometimes Hadoop answers a query with information from the .xml configuration files on the client machine, and sometimes from those on the various server machines. Make sure hdfs-site.xml has the same value on the data nodes, the client node (where you ran hdfs fsck from), and the name node. I maintain a central repository for the configuration files (customized for the particulars of each node) and push them out globally as they change.
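As a quick sanity check (just a sketch; the paths and the target value depend on your setup), you can ask each machine what its local client configuration resolves for dfs.replication, and raise the replication of any files that were written before the config change:

hdfs getconf -confKey dfs.replication          # value resolved from the local hdfs-site.xml
hdfs dfs -setrep -w 2 /path/to/existing/file   # re-replicate a file written with the old factor

If the getconf value differs between the client, name node, and data nodes, that mismatch is exactly what makes fsck report a different "Default replication factor" than you expect.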
Upvotes: 1