grep

Reputation: 5623

Why is double the amount of memory used for NameNode files?

In the Cloudera blog and on the Hortonworks forum I read:

"Every file, directory and block in HDFS is represented as an object in the namenode’s memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use about 3 gigabytes of memory"

BUT:

10,000,000 * 150 = 1,500,000,000 bytes = 1.5 GB.

It looks like 300 bytes per file would need to be allocated to reach 3 GB. I don't understand why each file uses 300 bytes instead of 150. It's just the NameNode; the replication factor shouldn't come into it.

Thanks

Upvotes: 1

Views: 162

Answers (1)

gudok

Reputation: 4179

For every small file, the namenode needs to store two objects in memory: a per-file object and a per-block object. This works out to approximately 300 bytes per file.
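A quick back-of-the-envelope sketch of the estimate (the 150-byte figures are the rule-of-thumb values from the quote, not exact JVM measurements, and the class name is made up for illustration):

```java
public class NameNodeMemoryEstimate {
    // Rule-of-thumb sizes from the quoted guideline (assumptions, not exact):
    static final long BYTES_PER_FILE_OBJECT  = 150; // per-file metadata object
    static final long BYTES_PER_BLOCK_OBJECT = 150; // per-block metadata object

    public static void main(String[] args) {
        long files = 10_000_000L;  // 10 million small files
        long blocksPerFile = 1;    // each file fits in a single block

        long totalBytes = files * (BYTES_PER_FILE_OBJECT
                + blocksPerFile * BYTES_PER_BLOCK_OBJECT);

        // 10,000,000 * (150 + 150) = 3,000,000,000 bytes = 3.0 GB
        System.out.printf("Estimated namenode heap: %.1f GB%n", totalBytes / 1e9);
    }
}
```

So the question's arithmetic with 150 bytes covers only the file objects; adding the block objects doubles it to the 3 GB figure. Note that replication has no effect here: replicas live on datanodes, while the namenode keeps a single block object regardless of the replication factor.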

Upvotes: 2
