Reputation: 392
I was reading about HDFS and was wondering whether the data inside a block is arranged in any specific format.
Suppose a 265 MB file is copied to a Hadoop cluster whose HDFS block size is 64 MB. The file is then broken into 5 parts (64 MB + 64 MB + 64 MB + 64 MB + 9 MB) and distributed among the data nodes. Is that correct?
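To spell out the arithmetic I have in mind, here is a small sketch in plain Python. `split_into_blocks` is just an illustrative helper I made up, not part of any HDFS API, and sizes are in whole megabytes for simplicity.

```python
def split_into_blocks(file_size_mb, block_size_mb):
    """Return the sizes (in MB) of the pieces a file would be cut into."""
    # Number of completely filled blocks, plus whatever is left over.
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last piece only holds the leftover data.
        sizes.append(remainder)
    return sizes


blocks = split_into_blocks(265, 64)
print(blocks)       # [64, 64, 64, 64, 9]
print(len(blocks))  # 5
```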
If anyone can answer these questions, that would be great. Thanks in advance.
Regards,
(*Vipul)() ;
Upvotes: 3
Views: 2389
Reputation: 5018
You can list the blocks of every file, and the DataNodes that hold them, with:
hdfs fsck / -files -blocks -locations
Here's an example of how the block files are stored on a DataNode's local disk with a 128 MB block size:
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:17 blk_1073741825
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:17 blk_1073741825_1001.meta
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:18 blk_1073741826
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:18 blk_1073741826_1002.meta
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:18 blk_1073741827
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:18 blk_1073741827_1003.meta
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:18 blk_1073741828
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:18 blk_1073741828_1004.meta
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:19 blk_1073741829
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:19 blk_1073741829_1005.meta
-rw-r--r--. 1 hdfs hadoop 134217728 Jan 12 09:19 blk_1073741830
-rw-r--r--. 1 hdfs hadoop 1048583 Jan 12 09:19 blk_1073741830_1006.meta
-rw-r--r--. 1 hdfs hadoop 87776064 Jan 12 09:19 blk_1073741831
-rw-r--r--. 1 hdfs hadoop 685759 Jan 12 09:19 blk_1073741831_1007.meta
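The sizes in this listing can be reproduced with a quick calculation. The sketch below rests on assumptions the listing itself does not state: that all seven blocks belong to one file, that each `.meta` file starts with a 7-byte header, and that it stores one 4-byte CRC per 512 bytes of block data (the usual default for `dfs.bytes-per-checksum`). The function names are illustrative only, not HDFS APIs.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 134217728 bytes, as in the listing
BYTES_PER_CHECKSUM = 512        # assumed default of dfs.bytes-per-checksum
CHECKSUM_SIZE = 4               # assumed: one 4-byte CRC per chunk
META_HEADER_SIZE = 7            # assumed size of the .meta file header


def block_sizes(file_size, block_size=BLOCK_SIZE):
    """Split a file length into full blocks plus one partial last block."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def meta_size(block_len):
    """Expected .meta size: header plus one checksum per (partial) chunk."""
    chunks = -(-block_len // BYTES_PER_CHECKSUM)  # integer ceiling division
    return META_HEADER_SIZE + chunks * CHECKSUM_SIZE


# File length inferred from the listing: six full blocks plus the last one.
file_size = 6 * 134217728 + 87776064

for size in block_sizes(file_size):
    print(size, meta_size(size))
# Full blocks print: 134217728 1048583
# Last block prints: 87776064 685759
```

Under those assumptions the numbers match the listing exactly. Note also that the last block file is only as large as the data it holds (87776064 bytes), not a full 128 MB.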
Upvotes: 6