Reputation: 275
I'm deploying Hadoop as a multi-node cluster (distributed mode), but each datanode ends up with a different cluster ID.
On slave1,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-2ecca585-6672-476e-9931-4cfef9946c3b
On slave2,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-e24b0548-2d8d-4aa4-9b8c-a336193c006e
I also followed the linked question "Datanode not starts correctly", but I don't know which cluster ID I should pick. If I pick either one, the datanode starts on that machine but not on the other. Also, when I format the namenode using the basic command (hadoop namenode -format), the datanodes on both slave nodes start, but then the namenode on the master machine doesn't start.
Upvotes: 3
Views: 3076
Reputation: 8522
The clusterID of every datanode must match the clusterID of the namenode; only then can the datanodes communicate with the namenode. Formatting the namenode assigns it a new clusterID, so the clusterIDs stored on the datanodes no longer match.
You can find a VERSION file containing the clusterID in the datanode directory (/home/pushuser1/hadoop/tmp/dfs/data/current/) as well as in the namenode directory (/home/pushuser1/hadoop/tmp/dfs/name/current/, based on the value you specified for dfs.namenode.name.dir).
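To see which IDs are actually stored, you can read the clusterID line out of each VERSION file. This is a minimal sketch: `cluster_id` is a helper name made up for illustration, and it assumes the usual `clusterID=CID-...` line in a properties-style VERSION file.

```shell
# Print the clusterID recorded in a Hadoop VERSION file.
# Assumes the file contains a line of the form: clusterID=CID-...
# cluster_id is a hypothetical helper, not part of Hadoop.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# Example calls, using the paths from the question (adjust to your own
# dfs.namenode.name.dir and datanode data directory):
#   cluster_id /home/pushuser1/hadoop/tmp/dfs/name/current/VERSION   # on the master
#   cluster_id /home/pushuser1/hadoop/tmp/dfs/data/current/VERSION   # on each slave
```

If the value printed on a slave differs from the one printed on the master, that datanode will fail with the "Incompatible clusterIDs" error shown in the question.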
If you are ready to format your HDFS namenode, stop all HDFS services and clear out all files inside the following directories:
rm -rf /home/pushuser1/hadoop/tmp/dfs/data/* (run this on every datanode)
rm -rf /home/pushuser1/hadoop/tmp/dfs/name/*
Then format HDFS again (hadoop namenode -format).
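Put together, the whole procedure might look like the sketch below. Keep in mind that it wipes everything stored in HDFS. `clear_dir` and `DFS_ROOT` are names made up for illustration; `stop-dfs.sh`, `start-dfs.sh` and `hdfs namenode -format` are the standard Hadoop 2+ scripts and command (the last one is the newer spelling of `hadoop namenode -format`), and are assumed to be on your PATH. The Hadoop commands are left as comments because they have to be run on specific hosts.

```shell
# Base directory from the question; adjust to your configuration.
DFS_ROOT=/home/pushuser1/hadoop/tmp/dfs

# Empty a storage directory but keep the directory itself.
# The :? guard aborts when no argument is given, so this never runs against "/".
# clear_dir is a hypothetical helper, not part of Hadoop.
clear_dir() {
    dir="${1:?usage: clear_dir DIR}"
    find "$dir" -mindepth 1 -delete
}

# 1. On the master, stop HDFS:
#      stop-dfs.sh
# 2. On every datanode host:
#      clear_dir "$DFS_ROOT/data"
# 3. On the master:
#      clear_dir "$DFS_ROOT/name"
# 4. On the master, format the namenode (this generates a new clusterID):
#      hdfs namenode -format
# 5. On the master, start HDFS again:
#      start-dfs.sh
```

Because the datanode directories are empty when HDFS starts again, each datanode registers with the freshly formatted namenode and records its new clusterID, so the IDs match on every host.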
Upvotes: 9