backtrack

Reputation: 8144

hadoop - HDFS file distribution

I just started playing with Hadoop and have the following question. We know that the NameNode holds metadata about the input blocks. My questions are:

  1. How can I view or query the metadata?
  2. How can I see how my input file is split into blocks and distributed?
  3. How can I make sure that my input file has been split into blocks and distributed in HDFS?

PS: I have already consulted the following site:

http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

Thanks!

Upvotes: 1

Views: 471

Answers (2)

Tariq

Reputation: 34184

  1. How can I view or query the metadata?

    You can do that with the Offline Image Viewer. It is a tool that dumps the contents of fsimage files to human-readable formats, allowing offline analysis and examination of a Hadoop cluster's namespace.

    Usage:

    bin/hdfs oiv -i fsimage -o fsimage.txt

    You can find more on this here.
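As a slightly fuller sketch of the usage above: the `-p` option of `hdfs oiv` selects an output processor, and `XML` produces a structured dump. The file names `fsimage` and `fsimage.xml`, the `FSIMAGE` and `OUTPUT` variables, and the `dump_fsimage` helper are assumptions for illustration only. The default processor and the fsimage format may differ between Hadoop releases, so check the documentation for your version.

```shell
#!/bin/sh
# Hypothetical local file names; adjust to your environment.
FSIMAGE="${FSIMAGE:-fsimage}"
OUTPUT="${OUTPUT:-fsimage.xml}"

# Dump an fsimage file ($1) to an XML file ($2) with the Offline Image Viewer.
dump_fsimage() {
  hdfs oiv -p XML -i "$1" -o "$2"
}

# Only run when the Hadoop client is installed and a local fsimage copy exists.
if command -v hdfs >/dev/null 2>&1 && [ -f "$FSIMAGE" ]; then
  dump_fsimage "$FSIMAGE" "$OUTPUT"
else
  echo "skipping: need the hdfs command and a local copy of the fsimage file"
fi
```

The resulting XML file can then be searched with ordinary text tools, without touching the running NameNode.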

  2. How can I see how my input file is split into blocks and distributed?

    The easiest way is to point your web browser to the HDFS web UI, i.e. namenode_machine:50070. Then browse to the file in question and click to open it. Scroll down and you can see the location of each block of this file.

    Alternatively, you can use getFileBlockLocations(FileStatus file, long start, long len), provided by the FileSystem API. It returns an array of BlockLocation objects, each containing the hostnames, offset, and length of a portion of the given file.
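The same information is also available from the command line. A minimal sketch, assuming the Hadoop client is installed: `fsck` with `-files -blocks -locations` lists each block of a file and the DataNodes holding its replicas. The path `/user/hadoop/input.txt`, the `HDFS_FILE` variable, and the `list_blocks` helper are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical HDFS path; set HDFS_FILE to point at your own file.
FILE_PATH="${HDFS_FILE:-/user/hadoop/input.txt}"

# Print every block of the given path together with its replica locations.
list_blocks() {
  hdfs fsck "$1" -files -blocks -locations
}

if command -v hdfs >/dev/null 2>&1; then
  list_blocks "$FILE_PATH"
else
  echo "hdfs command not found; run this on a machine with the Hadoop client"
fi
```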

  3. How can I make sure that my input file has been split into blocks and distributed in HDFS?

    You can use fsck for that. It reports the relevant details for a given path, such as total blocks, minimally replicated blocks, under-replicated blocks, and so on.
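As a rough sketch of how such a check could be scripted: the fsck summary ends with a line stating whether the path is HEALTHY, so a script can grep for it. The path, the `HDFS_FILE` variable, and the `is_healthy` helper are assumptions for illustration. Note that a path with under-replicated blocks can still be reported as HEALTHY, so read the full output when replication matters.

```shell
#!/bin/sh
# Hypothetical HDFS path; set HDFS_FILE to point at your own file.
FILE_PATH="${HDFS_FILE:-/user/hadoop/input.txt}"

# Succeeds when the fsck summary for the given path reports HEALTHY.
is_healthy() {
  hdfs fsck "$1" | grep -q "is HEALTHY"
}

if command -v hdfs >/dev/null 2>&1; then
  if is_healthy "$FILE_PATH"; then
    echo "$FILE_PATH: fsck reports HEALTHY"
  else
    echo "$FILE_PATH: fsck did not report HEALTHY; inspect the full output"
  fi
else
  echo "hdfs command not found; run this on a machine with the Hadoop client"
fi
```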

Upvotes: 4

user2486495

Reputation: 1729

The NameNode's metadata is stored in a file called "fsimage". See the link below for reference:

Content of the fsimage hdfs

Upvotes: 0
