user1971133

Reputation: 185

How does HDFS write to a disk on the data node

I'm not an expert in how file systems work, but this question may help me clear up some vague concepts. How does HDFS write to the physical disk?

I understand HDFS typically runs on top of an ext3 file system. These file systems have a block size far smaller than the HDFS block size, so if I write a logical HDFS block of 128 MB, the disk ends up storing it as many smaller physical blocks.
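For example, assuming the common ext3 block size of 4 KB, a single 128 MB HDFS block would map to 128 MB / 4 KB = 32,768 filesystem blocks.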

Does HDFS ensure these physical blocks are contiguous? (Contiguous blocks increase file system throughput because they minimize seek time.)

How does HDFS deliver high throughput?

Upvotes: 1

Views: 2412

Answers (1)

woopi

Reputation: 368

As far as I know, HDFS does not care about the physical file system on which it runs. I have installed Hadoop on several different file systems; for example, I have also used Solaris ZFS.

The blocks of Hadoop/HDFS are written as ordinary files on each datanode. The namenode takes the role of the inodes or the FAT in an OS file system. HDFS is a layer above the physical file system on each datanode.

You can list the stored blocks in your Hadoop/HDFS file system just by listing the directory contents on a datanode:

/srv/hadoop/hadoop_data/hdfs/datanode/current/BP-1458088587-192.168.1.51-1394008575227/current/finalized$ ls -alh ./blk_1073741838
-rw-r--r-- 1 hadoop hadoop 1.4M Mar  6 10:55 ./blk_1073741838
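
If you would rather go the other way and map an HDFS file to its underlying blocks, you can ask the namenode with fsck (the path below is just a placeholder for a file in your cluster):

hdfs fsck /user/hadoop/somefile.txt -files -blocks -locations

This prints, for each logical block of the file, the block ID (matching the blk_* file names above) and the datanodes holding its replicas.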

Upvotes: 2
