Tom Taylor

Reputation: 3540

How is the DataNode path created in Hadoop?

Here is the sample snippet I use to write a file to HDFS:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class WriteFileToHDFS {
  public static void main(String[] args) throws IOException, URISyntaxException 
   {
      System.setProperty("hadoop.home.dir", "/");   
      System.setProperty("HADOOP_USER_NAME", "hdfs");  

      //1. Get an instance of Configuration
      Configuration configuration = new Configuration();

      //2. Create an InputStream to read the data from local file
      InputStream inputStream = new BufferedInputStream(new FileInputStream("/Users/rabbit/Research/hadoop/sample_files/TAO.mp4"));  

      //3. Get the HDFS instance
      FileSystem hdfs = FileSystem.get(new URI("hdfs://192.168.143.150:9000"), configuration);  

      //4. Open an OutputStream to write the data; this is obtained from the FileSystem
      OutputStream outputStream = hdfs.create(new Path("hdfs://192.168.143.150:9000/filestore/TAO.mp4"),   
      new Progressable() {  
              @Override
              public void progress() {
                System.out.println("....");
              }
        });
      try
      {
        IOUtils.copyBytes(inputStream, outputStream, 4096, false); 
      }
      finally
      {
        IOUtils.closeStream(inputStream);
        IOUtils.closeStream(outputStream);
      } 
  }
}

I expected the block to be written to /data/hadoop-data/dn/current/blk_1073741869, but it was written to /data/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/blk_1073741869 instead. I do not understand where the BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0 part of the path comes from.

How is the path structure defined when writing to a DataNode in Hadoop?

Upvotes: 2

Views: 386

Answers (2)

Tom Taylor

Reputation: 3540

BP stands for "Block Pool", a collection of blocks that belong to a single HDFS namespace.

The next part, 1308070615, is a randomly generated integer.

The IP address 172.22.131.23 is the address of the NameNode that originally created the block pool.

The last part, 1533215887051, is the creation time of the namespace, in milliseconds since the Unix epoch.
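
To make the breakdown concrete, here is a minimal sketch that splits a block pool ID into those three parts and decodes the trailing number as a timestamp. The class BlockPoolIdParser and its field names are hypothetical and not part of Hadoop; the sketch assumes an IPv4 address (no dashes inside it) and that the trailing number is milliseconds since the Unix epoch.

```java
import java.time.Instant;

// Hypothetical helper for illustration only; this class is not part of Hadoop.
public class BlockPoolIdParser {
    final String randomPart;      // randomly generated integer, kept as text
    final String nameNodeAddress; // address of the NameNode that created the pool
    final Instant createdAt;      // creation time decoded from the trailing number

    BlockPoolIdParser(String randomPart, String nameNodeAddress, Instant createdAt) {
        this.randomPart = randomPart;
        this.nameNodeAddress = nameNodeAddress;
        this.createdAt = createdAt;
    }

    // Splits "BP-<random>-<ip>-<millis>" into its three components.
    static BlockPoolIdParser parse(String id) {
        int firstDash = id.indexOf('-', 3);   // dash after the random part
        int lastDash = id.lastIndexOf('-');   // dash before the timestamp
        if (!id.startsWith("BP-") || firstDash < 0 || lastDash <= firstDash) {
            throw new IllegalArgumentException("Not a block pool ID: " + id);
        }
        String random = id.substring(3, firstDash);
        String address = id.substring(firstDash + 1, lastDash);
        // Assumption: the trailing number is milliseconds since the Unix epoch.
        long millis = Long.parseLong(id.substring(lastDash + 1));
        return new BlockPoolIdParser(random, address, Instant.ofEpochMilli(millis));
    }

    public static void main(String[] args) {
        BlockPoolIdParser p = parse("BP-1308070615-172.22.131.23-1533215887051");
        System.out.println("random part: " + p.randomPart);
        System.out.println("NameNode:    " + p.nameNodeAddress);
        System.out.println("created at:  " + p.createdAt);
    }
}
```

For the ID in the question, the decoded creation time falls on 2 August 2018 (UTC), which is presumably when the NameNode was formatted.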

Upvotes: 0

Abdulhafeth Sartawi

Reputation: 1166

BP stands for "block pool", a collection of blocks belonging to a single HDFS namespace.

This directory layout is how HDFS manages data blocks on a DataNode. The following post explains the layout in detail:

https://hortonworks.com/blog/hdfs-metadata-directories-explained/
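
As a rough illustration of how the full block path from the question fits together, the sketch below rebuilds it from the data directory, the block pool ID and the block ID. The class BlockPathSketch and its method names are hypothetical. The bit masks used to pick the two subdir levels are an assumption about the DataNode's internal layout (a 32x32 fan-out, as used by recent releases as far as I know); older releases may use a different fan-out, so treat this as a sketch rather than the definitive rule.

```java
// Hypothetical helper for illustration only; this class is not part of Hadoop.
public class BlockPathSketch {

    // Assumption: two directory levels derived from bits of the block ID,
    // with 32 entries per level (mask 0x1F). Other releases may differ.
    static String subdirsFor(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0x1F);
        int d2 = (int) ((blockId >> 8) & 0x1F);
        return "subdir" + d1 + "/subdir" + d2;
    }

    // Assembles: <dataDir>/current/<blockPoolId>/current/finalized/<subdirs>/blk_<blockId>
    static String blockFile(String dataDir, String blockPoolId, long blockId) {
        return dataDir + "/current/" + blockPoolId
                + "/current/finalized/" + subdirsFor(blockId)
                + "/blk_" + blockId;
    }

    public static void main(String[] args) {
        // Values taken from the question.
        System.out.println(blockFile(
                "/data/hadoop-data/dn",
                "BP-1308070615-172.22.131.23-1533215887051",
                1073741869L));
    }
}
```

With the values from the question this reproduces the observed path, including subdir0/subdir0, because the relevant bits of block ID 1073741869 are all zero.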

Upvotes: 1
