Hadoop "No space left on device" error when there is space available

Question

I have a cluster of 5 Linux machines, with 3 data nodes and one master. About 50% of the HDFS storage is currently available on each data node. But when I run a MapReduce job, it fails with the following error:

2017-08-21 17:58:47,627 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for blk_6835454799524976171_3615612 bad datanode[0] 10.11.1.42:50010
2017-08-21 17:58:47,628 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_6835454799524976171_3615612 in pipeline 10.11.1.42:50010, 10.11.1.43:50010: bad datanode 10.11.1.42:50010
2017-08-21 17:58:51,785 ERROR org.apache.hadoop.mapred.Child: Error in syncLogs: java.io.IOException: No space left on device

Meanwhile, df -h on each system gives the following:

Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 5.9G     0  5.9G   0% /dev
tmpfs                    5.9G   84K  5.9G   1% /dev/shm
tmpfs                    5.9G  9.1M  5.9G   1% /run
tmpfs                    5.9G     0  5.9G   0% /sys/fs/cgroup
/dev/mapper/centos-root   50G  6.8G   44G  14% /
/dev/sdb                 1.8T  535G  1.2T  31% /mnt/11fd6fcc-1f87-4f1e-a53c-54cc7117759c
/dev/mapper/centos-home  412G  155G  59M  100% /home
/dev/sda1                494M  348M  147M  71% /boot
tmpfs                    1.2G   16K  1.2G   1% /run/user/42
tmpfs                    1.2G     0  1.2G   0% /run/user/1000

As is clear from the above, my sdb disk (SSD) is only 31% used, but centos-home is 100% full. Why is Hadoop using the local file system in a MapReduce job when enough HDFS space is available? Where is the problem? I have searched Google and found many similar problems, but none of them covers my situation.

tk421 · Accepted Answer

syncLogs does not use HDFS; it writes to the local directory set by hadoop.log.dir. So if you're using MapReduce, look for the value of hadoop.log.dir in /etc/hadoop/conf/taskcontroller.cfg.

If you're using YARN, look for the value of yarn.nodemanager.log-dirs in the yarn-site.xml.

One of these should point you to where you're writing your logs. Once you figure out which filesystem has the problem, you can free space from there.
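To make that lookup concrete, here is a minimal Python sketch. It assumes the usual Hadoop config layouts: key=value lines in taskcontroller.cfg and <property><name>/<value> pairs in yarn-site.xml. The helper names (cfg_value, site_xml_property, report_free_space) and the sample paths are hypothetical, and the sketch assumes the configured values are literal paths rather than ${...} variable references.

```python
import shutil
import xml.etree.ElementTree as ET


def site_xml_property(xml_text, name):
    """Return the <value> of the named <property> in a Hadoop *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def cfg_value(cfg_text, key):
    """Return the value for key from key=value text (taskcontroller.cfg style), or None."""
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not key=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        k, v = line.split("=", 1)
        if k.strip() == key:
            return v.strip()
    return None


def report_free_space(paths):
    """Print free space on the filesystem that holds each existing path."""
    for path in paths:
        usage = shutil.disk_usage(path)  # raises OSError if the path does not exist
        print(f"{path}: {usage.free / 2**30:.1f} GiB free of {usage.total / 2**30:.1f} GiB")


if __name__ == "__main__":
    # Hypothetical sample config; on a real node, read the actual yarn-site.xml instead.
    sample = """<configuration>
      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/hadoop/logs,/mnt/disk1/logs</value>
      </property>
    </configuration>"""
    log_dirs = site_xml_property(sample, "yarn.nodemanager.log-dirs").split(",")
    print(log_dirs)
```

On a real node you would pass the resulting directories to report_free_space (or run df -h on them) to see which filesystem they actually live on.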

Another thing to remember is that you can also get "No space left on device" if you've exhausted the inodes on your disk, even when free blocks remain. df -i would show this.
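The same inode check can be scripted. The sketch below uses os.statvfs from the Python standard library (Unix only); the function name inode_usage is hypothetical, and "/" is just a placeholder for whichever directory your logs are written to.

```python
import os


def inode_usage(path):
    """Return (total, free, percent_used) inodes for the filesystem holding path.

    percent_used is None when the filesystem reports no fixed inode count.
    """
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    if total == 0:
        return total, free, None
    return total, free, 100.0 * (total - free) / total


if __name__ == "__main__":
    total, free, pct = inode_usage("/")  # placeholder path
    if pct is None:
        print("filesystem does not report an inode limit")
    else:
        print(f"inodes: {total - free} used of {total} ({pct:.1f}%)")
```

A value close to 100% here would explain the error even when df -h still shows free space.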
