Reputation: 1882
Recently, I installed Hadoop and formatted the namenode. The namenode started fine, but the datanodes failed to start. Here is the datanode error log:
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r 3091053c59a62c82d82c9f778c48bde5ef0a89a1; compiled by 'jenkins' on 2018-05-11T07:53Z
STARTUP_MSG: java = 1.8.0_181
************************************************************/
2018-10-17 15:08:42,769 INFO datanode.DataNode (LogAdapter.java:info(47)) - registered UNIX signal handlers for [TERM, HUP, INT]
2018-10-17 15:08:43,665 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(122)) - Scheduling a check for [DISK]file:/hadoop/hdfs/data/
2018-10-17 15:08:43,682 ERROR datanode.DataNode (DataNode.java:secureMain(2692)) - Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value configured for dfs.datanode.failed.volumes.tolerated - 1. Value configured is >= to the number of configured volumes (1).
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:174)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2584)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2493)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2540)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2685)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2709)
2018-10-17 15:08:43,688 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-10-17 15:08:43,696 INFO datanode.DataNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hdp2.com/192.168.100.12
What does "dfs.datanode.failed.volumes.tolerated - 1" mean? What caused this error?
Upvotes: 1
Views: 8767
Reputation: 768
Remove the following property from the hdfs-site.xml file.
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///C:/hadoop-3.3.0/data/datanode</value>
</property>
Upvotes: 0
Reputation: 1882
While trying to solve this problem, I searched the source code:
final int volFailuresTolerated =
    conf.getInt(DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_KEY,
        DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_DEFAULT);

String[] dataDirs = conf.getTrimmedStrings(DFSConfigKeys.DFS_DATANODE_DATA_DIR_KEY);
int volsConfigured = (dataDirs == null) ? 0 : dataDirs.length;
int volsFailed = volsConfigured - storage.getNumStorageDirs();
this.validVolsRequired = volsConfigured - volFailuresTolerated;

// the tolerated count must be non-negative and strictly less than
// the number of configured volumes
if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
  throw new DiskErrorException("Invalid volume failure "
      + " config value: " + volFailuresTolerated);
}
if (volsFailed > volFailuresTolerated) {
  throw new DiskErrorException("Too many failed volumes - "
      + "current valid volumes: " + storage.getNumStorageDirs()
      + ", volumes configured: " + volsConfigured
      + ", volumes failed: " + volsFailed
      + ", volume failures tolerated: " + volFailuresTolerated);
}
As you can see, the documentation for this property says:
The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown.
In other words, it is the number of damaged disks a datanode can tolerate. In a Hadoop cluster, disks frequently become read-only or corrupted. At startup, the datanode uses the directories configured under dfs.datanode.data.dir to store blocks; if the number of unusable directories exceeds the tolerated number configured above, the DataNode fails to start.
In my Hadoop environment, dfs.datanode.data.dir is configured with only one disk, and dfs.datanode.failed.volumes.tolerated was set to 1 in order to allow one disk to go bad. With only one disk actually configured, volFailuresTolerated and volsConfigured are both 1, so the check volFailuresTolerated >= volsConfigured is true and the code above throws the DiskErrorException.
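A minimal sketch of the fix for a setup like mine, assuming the single data directory /hadoop/hdfs/data/ from the log above: keep dfs.datanode.failed.volumes.tolerated strictly below the number of configured volumes, i.e. set it to 0 when there is only one volume (or configure additional data directories).
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/hadoop/hdfs/data/</value>
</property>
<property>
  <!-- with a single configured volume, the tolerated failure count must be less than 1 -->
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>0</value>
</property>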
Upvotes: 1
Reputation: 1269
Check hdfs-site.xml. This property must be set to a value of 0 or higher, and lower than the number of configured volumes:
dfs.datanode.failed.volumes.tolerated
The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown.
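For example, with a single data directory, the following hdfs-site.xml entry (a sketch; the value 0 assumes one configured volume) satisfies that constraint:
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>0</value>
</property>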
Upvotes: 0