jessica

Reputation: 2610

How to remove the very large files under /hadoop/hdfs/journal/hdfsha/current/

In our HDP cluster (version 2.6.5, with the Ambari platform),

we noticed that the /hadoop/hdfs/journal/hdfsha/current/ folder contains huge files, more than 1000 of them, such as:

-rw-r--r--. 1 hdfs hadoop 494690334 Dec 28 11:37 edits_0000000017251672645-0000000017253719335
-rw-r--r--. 1 hdfs hadoop 524892007 Dec 28 12:37 edits_0000000017253719336-0000000017255810613
-rw-r--r--. 1 hdfs hadoop 509365350 Dec 28 14:17 edits_0000000017255810614-0000000017258005682
-rw-r--r--. 1 hdfs hadoop 526756290 Dec 28 15:07 edits_0000000017258005683-0000000017260117992

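For scale, the total usage can be checked on a journal node with standard shell tools, for example:

# total size of the journal edits directory
du -sh /hadoop/hdfs/journal/hdfsha/current/
# count the finalized edit segment files
ls /hadoop/hdfs/journal/hdfsha/current/ | grep -c '^edits_[0-9]'
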
To minimize the journal edit logs, we can perhaps set the following in hdfs-site.xml.

We are not sure whether dfs.namenode.num.extra.edits.retained means that only 100 files are retained.

Please advise whether the following configuration can help purge the extra journal files:

dfs.namenode.num.extra.edits.retained=100
dfs.namenode.max.extra.edits.segments.retained=1
dfs.namenode.num.checkpoints.retained=1

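Before changing anything, the currently effective values can be read back with hdfs getconf (run on a cluster node with the HDFS client configuration in place):

# print the currently effective retention settings
hdfs getconf -confKey dfs.namenode.num.extra.edits.retained
hdfs getconf -confKey dfs.namenode.max.extra.edits.segments.retained
hdfs getconf -confKey dfs.namenode.num.checkpoints.retained
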
Reference: https://www.ibm.com/support/pages/how-remove-very-large-files-under-hadoophdfsnamecurrent-biginsights-30-save-disk-space

Upvotes: 0

Views: 1152

Answers (2)

linehrr

Reputation: 1748

I had the same problem: edits kept accumulating on the NameNode and the journal nodes. It turned out that the standby NameNode was dead. Reading the documentation, I found that merging and purging the edits is the responsibility of the standby NameNode;
in non-HA mode, the SecondaryNameNode does this.
So make sure your standby/secondary NameNode is running correctly.
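
A quick way to verify this is with hdfs haadmin. Note that nn1 and nn2 below are example NameNode IDs; your actual IDs are listed under dfs.ha.namenodes.<nameservice> in hdfs-site.xml:

# check which NameNode is active and which is standby
# (nn1/nn2 are example IDs; substitute your own)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2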

Upvotes: 0

rikamamanus

Reputation: 936

To clear out the space consumed by journal edits, you are on the right track. However, the values you chose are too low, and if something goes wrong you might lose data.

The default values of dfs.namenode.num.extra.edits.retained and dfs.namenode.max.extra.edits.segments.retained are 1000000 and 10000 respectively.

I would suggest the following values:

dfs.namenode.num.extra.edits.retained=100000
dfs.namenode.max.extra.edits.segments.retained=100
dfs.namenode.num.checkpoints.retained=2
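
Note that dfs.namenode.num.extra.edits.retained counts transactions, not files. For a sense of scale, each segment listed in the question spans roughly two million transactions:

# transactions covered by one of the segments from the question
echo $((17255810613 - 17253719336 + 1))   # => 2091278

So a value of 100000 keeps well under one extra segment beyond what a NameNode restart needs.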

You can find details for all of these parameters here. The values can be anything; you have to choose them depending on your environment.

Upvotes: 2
