How to explicitly define datanodes to store a particular given file in HDFS?

Question

I want to write a script, or something like an .xml file, that explicitly defines which datanodes in a Hadoop cluster store the blocks of a particular file. For example, suppose there are 4 slave nodes and 1 master node (5 nodes in total in the Hadoop cluster), and two files: file01 (size = 120 MB) and file02 (size = 160 MB). The default block size is 64 MB.

Now I want to store one of the two blocks of file01 on slave node1 and the other on slave node2. Similarly, I want one of the three blocks of file02 on slave node1, the second on slave node3, and the third on slave node4. So my question is: how can I do this?

Actually, there is one method: change the conf/slaves file every time I store a file. But I don't want to do this. Is there another solution? I hope I have made my point clear. Waiting for your kind response!

Chris White · Accepted Answer

There is no method to achieve what you are asking here. The name node will replicate blocks to data nodes based upon rack configuration, replication factor and node availability, so even if you do manage to get a block onto two particular data nodes, if one of those nodes goes down, the name node will replicate the block to another node.

Your requirement is also assuming a replication factor of 1, which doesn't give you any data redundancy (which is a bad thing if you lose a data node).
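To make the block and replica arithmetic concrete, here is a minimal Python sketch using the file sizes and the 64 MB block size from the question. The helper names `block_count` and `replica_count` are made up for illustration, and the replication factor of 3 is shown only for comparison because it is the usual HDFS default (`dfs.replication`).

```python
import math

BLOCK_SIZE_MB = 64  # default block size stated in the question


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks a file occupies; the last block may be partial."""
    return math.ceil(file_size_mb / block_size_mb)


def replica_count(file_size_mb, replication, block_size_mb=BLOCK_SIZE_MB):
    """Total block replicas stored across the cluster for one file."""
    return block_count(file_size_mb, block_size_mb) * replication


# File sizes (in MB) taken from the question.
files = {"file01": 120, "file02": 160}

for name, size_mb in files.items():
    for replication in (1, 3):
        print(
            f"{name}: {block_count(size_mb)} blocks, "
            f"replication={replication} -> "
            f"{replica_count(size_mb, replication)} replicas"
        )
```

With a replication factor of 1 each block exists on exactly one data node, so losing that node loses the block. With a higher factor the name node decides where the extra replicas go.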

Let the name node manage block assignments, and run the balancer periodically if you want to keep your cluster evenly distributed.
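As a rough sketch of what "run the balancer" and "see where the blocks ended up" look like in practice, the Python helpers below only build the command lines and print them, they do not run anything. They assume the `hdfs` launcher from Hadoop 2 and later is on the PATH (older releases use `hadoop balancer` and `hadoop fsck` instead). The function names and the path `/user/hadoop/file01` are hypothetical.

```python
import shlex


def balancer_command(threshold_percent=10):
    """Build the balancer invocation; -threshold is the allowed deviation
    (in percent) of each data node's disk usage from the cluster average."""
    return ["hdfs", "balancer", "-threshold", str(threshold_percent)]


def fsck_locations_command(path):
    """Build an fsck invocation that lists a file's blocks and the data
    nodes currently holding each replica."""
    return ["hdfs", "fsck", path, "-files", "-blocks", "-locations"]


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)


# Hypothetical HDFS path, used only for illustration.
print(as_shell(balancer_command()))
print(as_shell(fsck_locations_command("/user/hadoop/file01")))
```

The fsck output is read-only information: it shows where the name node placed the blocks, but it does not let you choose those locations.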
