Reputation: 847
The default block size in Hadoop 2.x is 128MB. What is the problem with 64MB?
Upvotes: 3
Views: 4674
Reputation: 3208
HDFS block sizes are this large to minimize seek time. The optimal block size depends on the average file size, the seek time, and the transfer rate.
The faster the disk, the bigger the data block can be, but there is a limit.
To take advantage of data locality, input splits have the same size as data blocks, and since a separate map task is started for each split, blocks that are too big reduce parallelism. So the best choice sits in between:
128 MB is a good compromise for today's disk speed, disk capacity, and computing performance.
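As a rough back-of-envelope check (the 10 ms seek time and 100 MB/s transfer rate below are assumed, typical figures for a spinning disk, not numbers from this answer):

    // Back-of-envelope: fraction of read time spent seeking vs. transferring one block.
    public class BlockSeekOverhead {
        public static void main(String[] args) {
            double seekTimeMs = 10.0;          // assumed average disk seek time
            double transferRateMBs = 100.0;    // assumed sustained transfer rate (MB/s)

            for (long blockMB : new long[] {64, 128, 256}) {
                double transferMs = blockMB / transferRateMBs * 1000.0;
                double overheadPct = seekTimeMs / (seekTimeMs + transferMs) * 100.0;
                System.out.printf("%4d MB block: transfer %.0f ms, seek overhead %.2f%%%n",
                        blockMB, transferMs, overheadPct);
            }
        }
    }

Under these assumptions, with 64 MB blocks the seek cost is roughly 1.5% of the read time, with 128 MB it drops below 1%, and going much larger mostly just costs parallelism.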
Upvotes: 2
Reputation: 38940
There are a few reasons for increasing the block size. It improves performance if you are managing a big Hadoop cluster holding petabytes of data.
If you are managing a cluster of 1 petabyte, a 64 MB block size results in 15+ million blocks, which is difficult for the NameNode to manage efficiently.
Having a lot of blocks also results in a lot of mappers during MapReduce execution.
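To put numbers on that (the ~150 bytes of NameNode heap per block used below is a commonly quoted rule of thumb, not an exact figure):

    // Rough block-count math for a 1 PB dataset at different block sizes.
    public class NameNodeBlockEstimate {
        public static void main(String[] args) {
            long datasetBytes = 1L << 50;   // 1 PB
            long perBlockMeta = 150;        // assumed bytes of NameNode heap per block (rule of thumb)

            for (long blockMB : new long[] {64, 128, 256}) {
                long blocks = datasetBytes / (blockMB << 20);
                double heapGB = blocks * perBlockMeta / (1024.0 * 1024.0 * 1024.0);
                System.out.printf("%4d MB blocks -> %,d blocks, ~%.1f GB of NameNode heap%n",
                        blockMB, blocks, heapGB);
            }
        }
    }

At 64 MB that is roughly 16.8 million blocks (about 2.3 GB of heap just for block metadata under this assumption); doubling the block size to 128 MB halves both.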
Depending on your data requirements, you can fine-tune dfs.blocksize.
By setting your block size properly (64 MB, 128 MB, 256 MB, or 512 MB), you can achieve a good balance between NameNode memory usage and MapReduce parallelism.
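For example, assuming a plain Hadoop 2.x client (the NameNode URI and output path below are placeholders), dfs.blocksize can be overridden per client or even per file:

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeConfig {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // The cluster-wide default normally lives in hdfs-site.xml (dfs.blocksize);
            // here we override it for this client only.
            conf.setLong("dfs.blocksize", 256L * 1024 * 1024);   // 256 MB

            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf); // placeholder URI

            // The block size can also be chosen per file at create time.
            Path out = new Path("/data/example.dat");            // placeholder path
            try (FSDataOutputStream stream =
                     fs.create(out, true, 4096, (short) 3, 128L * 1024 * 1024)) {
                stream.writeUTF("written with a 128 MB block size");
            }
        }
    }

The per-file overload can be useful when a single job writes files with very different access patterns.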
Refer to this link for more details.
Upvotes: 4