Bohdan

Reputation: 17213

What does "Heap Size" mean for Hadoop Namenode?

I'm trying to understand if there is something wrong with my Hadoop cluster. When I go to the web UI, the cluster summary says:

Cluster Summary

XXXXXXX files and directories, XXXXXX blocks = 7534776 total.
Heap Size is 1.95 GB / 1.95 GB (100%) 

And I'm concerned about why this Heap Size metric is at 100%.

Could someone please explain how the namenode heap size impacts cluster performance, and whether this needs to be fixed?

Upvotes: 3

Views: 15257

Answers (1)

Remus Rusanu

Reputation: 294407

The namenode Web UI computes the values like this:

<h2>Cluster Summary (Heap Size is <%= StringUtils.byteDesc(Runtime.getRuntime().totalMemory()) %>/<%= StringUtils.byteDesc(Runtime.getRuntime().maxMemory()) %>)</h2>

The Runtime Javadoc documents these as:

  • totalMemory() Returns the total amount of memory in the Java virtual machine.
  • maxMemory() Returns the maximum amount of memory that the Java virtual machine will attempt to use.
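
For illustration, here is a minimal standalone sketch of how those two calls behave (the class name HeapSummary is mine, not Hadoop's); it prints the same pair of numbers the UI reports, using binary GB like StringUtils.byteDesc does:

    // Minimal sketch: print the same two heap figures the namenode UI shows.
    // totalMemory() is the heap the JVM has currently reserved;
    // maxMemory() is the ceiling it may grow to (the -Xmx value).
    public class HeapSummary {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            double gb = 1024.0 * 1024 * 1024;
            long total = rt.totalMemory();
            long max = rt.maxMemory();
            System.out.printf("Heap Size is %.2f GB / %.2f GB (%.0f%%)%n",
                    total / gb, max / gb, 100.0 * total / max);
        }
    }

A total sitting at 100% of max simply means the JVM has already grown the heap all the way to its -Xmx limit.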

Max is going to be the -Xmx parameter from the service start command. The main factor behind the total is the number of blocks in your HDFS cluster: the namenode requires ~150 bytes for each block, plus ~16 bytes for each extra replica, and all of it must be kept in live memory. So the default replication factor of 3 gives roughly 150 + 2×16 = 182 bytes per block, and your 7534776 blocks come to about 1.3 GB. Add all the other non-file-related memory the namenode uses, and 1.95 GB sounds about right. I would say your HDFS cluster has outgrown the namenode and it needs more RAM: if possible, increase the namenode startup -Xmx; if the box is already maxed out, you'll need a bigger VM/physical machine.
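
As a back-of-the-envelope check, here is that arithmetic spelled out (the ~150 and ~16 byte figures are rough rules of thumb, not exact Hadoop constants):

    // Rough namenode heap estimate for block metadata only.
    // bytesPerBlock uses the rule-of-thumb figures from the text:
    // ~150 bytes per block plus ~16 bytes per extra replica.
    public class NamenodeHeapEstimate {
        public static void main(String[] args) {
            long blocks = 7_534_776L;                // from the cluster summary
            int replication = 3;                     // default HDFS replication
            long bytesPerBlock = 150 + 16L * (replication - 1);  // = 182
            double gb = blocks * bytesPerBlock / (1024.0 * 1024 * 1024);
            System.out.printf("~%.2f GB of heap for block metadata alone%n", gb);
        }
    }

That prints roughly 1.28 GB which, plus everything else the namenode keeps in memory, is consistent with the 1.95 GB you see. If you need to raise the limit, the namenode's -Xmx is typically set through HADOOP_NAMENODE_OPTS in hadoop-env.sh.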

Read The Small Files Problem, HDFS-5711.

Upvotes: 6
