Reputation: 2546
I´m having repeated crashes in my Cloudera cluster HDFS Datanodes due to an OutOfMemoryError
:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/hdfs_hdfs-DATANODE-e26e098f77ad7085a5dbf0d369107220_pid18551.hprof ...
Heap dump file created [2487730300 bytes in 16.574 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/usr/lib64/cmf/service/common/killparent.sh"
# Executing /bin/sh -c "/usr/lib64/cmf/service/common/killparent.sh"...
18551 TS 19 ? 00:25:37 java
Wed Aug 7 11:44:54 UTC 2019
JAVA_HOME=/usr/lib/jvm/java-openjdk
using /usr/lib/jvm/java-openjdk as JAVA_HOME
using 5 as CDH_VERSION
using /run/cloudera-scm-agent/process/3087-hdfs-DATANODE as CONF_DIR
using as SECURE_USER
using as SECURE_GROUP
CONF_DIR=/run/cloudera-scm-agent/process/3087-hdfs-DATANODE
CMF_CONF_DIR=/etc/cloudera-scm-agent
4194304
When analyzing the heap dump, the apparent biggest suspects are millions of instances of ScanInfo
apparently quequed in the ExecutorService
of the class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner
.
When I inspect the content of each ScanInfo
runnable object, I don´t see anything weird:
Apart from this and a bit high block count in HDFS, I don´t get any other information apart from the different DataNodes crashing randomly in my cluster.
Any idea why these objects keep queueing up in the DirectoryScanner
thread pool?
Upvotes: 0
Views: 1149
Reputation: 601
You can try once below command.
$ hadoop dfsadmin -finalizeUpgrade The -finalizeUpgrade command removes the previous version of the NameNode’s and DataNodes’ storage directories.
Upvotes: 1