Virender Dubey

Reputation: 206

HDFS data write process for nodes with different disk sizes

We have a 10-node HDFS cluster (Hadoop 2.6, Cloudera 5.8): 4 nodes have 10 TB of disk and 6 nodes have 3 TB. The disks on the smaller nodes are constantly filling up, while plenty of space remains free on the larger nodes.

I am trying to understand how the NameNode writes data/blocks to nodes with different disk sizes: is the data divided equally across nodes, or is it written in proportion to each node's capacity?

Upvotes: 1

Views: 1270

Answers (1)

tk421

Reputation: 5947

You should look at dfs.datanode.fsdataset.volume.choosing.policy. By default this is set to round-robin, but since you have an asymmetric disk setup you should change it to available space.


You can also fine-tune disk usage with the two companion properties, dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction, as in the sketch below.
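As a rough sketch, the change could look like this in hdfs-site.xml (the property names are standard Hadoop 2.x; the threshold and fraction shown are just the stock defaults, not tuned values for this cluster):

```xml
<!-- hdfs-site.xml on each DataNode -->

<!-- Pick volumes by free space instead of the default round-robin. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>

<!-- Volumes whose free space differs by less than this many bytes
     are considered balanced (default: 10 GB). -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>

<!-- Fraction of new block allocations routed to the volumes with more
     free space when volumes are unbalanced (default: 0.75). -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

On CDH you would normally set these through Cloudera Manager rather than editing the file by hand. After restarting the DataNodes you can verify the active policy with hdfs getconf -confKey dfs.datanode.fsdataset.volume.choosing.policy.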

For more information see:

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/admin_dn_storage_balancing.html

Upvotes: 1
