Harisyam

Reputation: 107

How to find the datanode where a particular file is stored and read from while running an MR Job?

I have 9 files stored in Hadoop, each equal in size to the cluster's block length. I need to get the addresses of the datanodes where the files are stored. The replication factor is 3.

Is there a Hadoop API to do this, or any other possible way?

Upvotes: 1

Views: 1918

Answers (2)

Tanveer Dayan

Reputation: 506

To do this from Java, you can use the following class:

org.apache.hadoop.hdfs.tools.DFSck

via this method:

doWork(final String[] args)

It builds a URI internally and prints all the details to System.out.
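A minimal sketch of driving this class from Java, assuming the Hadoop client jars and the cluster's *-site.xml configs are on the classpath. Note that doWork is private in the Hadoop releases I'm aware of, so the public entry point is the Tool interface via ToolRunner, whose run() method delegates to doWork(). The path below is just the example file from the other answer.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.tools.DFSck;
    import org.apache.hadoop.util.ToolRunner;

    public class FsckBlockReport {
        public static void main(String[] args) throws Exception {
            // Picks up fs.defaultFS etc. from core-site.xml / hdfs-site.xml on the classpath
            Configuration conf = new Configuration();
            // Same arguments as the fsck CLI; the report is printed to System.out
            String[] fsckArgs = {"/user/tom/part-00007", "-files", "-blocks", "-racks"};
            int exitCode = ToolRunner.run(new DFSck(conf), fsckArgs);
            System.exit(exitCode);
        }
    }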

Upvotes: 0

Tanveer Dayan

Reputation: 506

The command to find the blocks and datanodes of a file is given below:

 hadoop fsck /user/tom/part-00007 -files -blocks -racks

This displays a result like the following:

/user/tom/part-00007 25582428 bytes, 1 block(s): OK
0. blk_-3724870485760122836_1035 len=25582428 repl=3 [/default-rack/10.251.43.2:50010,
/default-rack/10.251.27.178:50010, /default-rack/10.251.123.163:50010]

The bracketed list shows the datanodes where each block's replicas are placed.
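For a programmatic equivalent, which may be closer to the "Hadoop API" the question asks for, the same block-to-datanode mapping is exposed through FileSystem#getFileBlockLocations. A minimal sketch, again using the example path above in place of one of your 9 files:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import java.util.Arrays;

    public class BlockLocations {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            FileStatus status = fs.getFileStatus(new Path("/user/tom/part-00007"));
            // One BlockLocation per block, covering the whole file (offset 0 to length)
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                // getHosts() returns the hostnames of the datanodes holding each replica
                System.out.println(Arrays.toString(block.getHosts()));
            }
            fs.close();
        }
    }

Since each of your files fits in a single block, this should print one line per file with the 3 replica hosts.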

Upvotes: 2
