Nima Mousavi

Reputation: 1661

hdfs fsck displays wrong replication factor

I just started using Hadoop and have been playing around with it. I googled a bit and found out that I have to change the properties in hdfs-site.xml to change the default replication factor... so that's what I did, and to be honest it works like a charm. When I add new files, they are automatically replicated with the new replication factor. But when I run something like:

hdfs fsck /

Then the output says that the default replication is 1. Maybe I'm just being pedantic about it, but I'd rather have that fixed... or rather: I've been relying on that output, so it took me a long time to realise that nothing was actually wrong... or is something wrong? Can someone help me interpret this fsck output?

..Status: HEALTHY
 Total size:    1375000000 B
 Total dirs:    1
 Total files:   2
 Total symlinks:        0
 Total blocks (validated):  12 (avg. block size 114583333 B)
 Minimally replicated blocks:   12 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    1
 Average block replication: 2.0
 Corrupt blocks:        0
 Missing replicas:      0 (0.0 %)
 Number of data-nodes:      4
 Number of racks:       1
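
For reference, the property I changed in hdfs-site.xml is dfs.replication. My entry looks roughly like this (the value 2 here is just an example; it matches the average replication reported above):

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>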

Upvotes: 0

Views: 469

Answers (1)

David Hazlett

Reputation: 111

Sometimes Hadoop answers queries with information from the .xml configuration files on the client machine, and sometimes from the ones on the various server machines. Make sure hdfs-site.xml has the same value on the data nodes, the client node (where you ran hdfs from), and the name node. I maintain a central repository for the configuration files (customized for the particulars of each node) and push them out to all nodes whenever they change.
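
A quick way to see what each machine actually resolves is to ask the Hadoop client on every node for the value (just a sketch; dfs.replication is the property that controls the default):

hdfs getconf -confKey dfs.replication

Run that on the client box, the name node and a data node; if they disagree, that inconsistency is the most likely source of the confusing "Default replication factor: 1" line. Also note that changing the default only affects files written afterwards; existing files keep the replication they were created with, though you can change it later with, for example:

hdfs dfs -setrep -w 2 /path/to/file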

Upvotes: 1
