Reputation: 8678
I have a cluster of 5 Linux machines, with 3 data nodes and one master. About 50% of the HDFS storage is currently available on each data node. But when I run a MapReduce job, it fails with the following error:
2017-08-21 17:58:47,627 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for blk_6835454799524976171_3615612 bad datanode[0] 10.11.1.42:50010
2017-08-21 17:58:47,628 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_6835454799524976171_3615612 in pipeline 10.11.1.42:50010, 10.11.1.43:50010: bad datanode 10.11.1.42:50010
2017-08-21 17:58:51,785 ERROR org.apache.hadoop.mapred.Child: Error in syncLogs: java.io.IOException: No space left on device
Meanwhile, on each system, df -h gives the following output:
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 5.9G     0  5.9G   0% /dev
tmpfs                    5.9G   84K  5.9G   1% /dev/shm
tmpfs                    5.9G  9.1M  5.9G   1% /run
tmpfs                    5.9G     0  5.9G   0% /sys/fs/cgroup
/dev/mapper/centos-root   50G  6.8G   44G  14% /
/dev/sdb                 1.8T  535G  1.2T  31% /mnt/11fd6fcc-1f87-4f1e-a53c-54cc7117759c
/dev/mapper/centos-home  412G  155G   59M 100% /home
/dev/sda1                494M  348M  147M  71% /boot
tmpfs                    1.2G   16K  1.2G   1% /run/user/42
tmpfs                    1.2G     0  1.2G   0% /run/user/1000
As is clear from the above, my sdb disk (SSD) is only 31% used, but centos-home is 100% full. Why is Hadoop using the local file system in a MapReduce job when there is enough HDFS space available? Where is the problem? I have searched Google and found many similar problems, but none of them covers my situation.
Upvotes: 1
Views: 2990
Reputation: 5185
Please check how many inodes are used. If I understand it right, even when the disk itself is not full, the error is still the same, "No space left on device", once all inodes are used up.
Upvotes: 0
Reputation: 5967
syncLogs does not use HDFS; it writes to hadoop.log.dir. So if you're using MapReduce, look for the value of hadoop.log.dir in /etc/hadoop/conf/taskcontroller.cfg.
If you're using YARN, look for the value of yarn.nodemanager.log-dirs in yarn-site.xml.
One of these should point you to where you're writing your logs. Once you figure out which filesystem has the problem, you can free space from there.
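As a rough illustration of that lookup for the YARN case, here is a minimal Python sketch. The helper names (read_property, split_dirs, free_space_report) are made up for this example, and it assumes the property is set explicitly in your yarn-site.xml. If the value contains placeholders such as ${yarn.log.dir}, this sketch does not expand them.

```python
import shutil
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def split_dirs(value):
    """Split a comma-separated directory list, dropping empty entries."""
    return [d.strip() for d in value.split(",") if d.strip()]


def free_space_report(dirs):
    """Map each directory to (free_bytes, percent_used) for its filesystem.

    Directories that do not exist on this machine are mapped to None.
    """
    report = {}
    for d in dirs:
        try:
            usage = shutil.disk_usage(d)
        except FileNotFoundError:
            report[d] = None
            continue
        percent = 100.0 * usage.used / usage.total if usage.total else 0.0
        report[d] = (usage.free, percent)
    return report
```

To use it, read your yarn-site.xml into a string, pass it to read_property with "yarn.nodemanager.log-dirs", and feed the split result to free_space_report. The percentage can differ slightly from df -h, because df accounts for blocks reserved for root.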
Another thing to remember is that you can get "No space left on device" if you've exhausted the inodes on your disk. df -i would show this.
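If you would rather check this from a script, the same counters are available through os.statvfs on Unix-like systems. This is only a sketch: inode_usage is a name invented for this example, and the path you pass in is up to you.

```python
import os


def inode_usage(path):
    """Return (total, free, percent_used) inodes for the filesystem holding path.

    percent_used is None when the filesystem reports no inode total.
    """
    st = os.statvfs(path)
    total = st.f_files   # total number of inodes on the filesystem
    free = st.f_ffree    # number of free inodes
    if total == 0:
        return total, free, None
    return total, free, 100.0 * (total - free) / total
```

Calling inode_usage("/home") on an affected node would show whether /home has run out of inodes as well as blocks.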
Upvotes: 1