Reputation: 1000
I know the default chunk (block) size in Hadoop is 64 MB. I want to change it. How can I do this? Thanks
Upvotes: 2
Views: 1455
Reputation: 38910
Hadoop 1.x: the default DFS block size is 64 MB.
Hadoop 2.x: the default DFS block size is 128 MB.
Have a look at hdfs-default.xml on the official site:
dfs.blocksize
Default: 134217728
The default block size for new files, in bytes. You can use the following suffixes (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa) to specify the size (such as 128k, 512m, 1g, etc.), or provide the complete size in bytes (such as 134217728 for 128 MB).
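If you only need a different block size for particular files rather than cluster-wide, the same property can be overridden per write on the command line. A minimal sketch (the local file name and HDFS path are hypothetical):

# Write a single file with a 256 MB block size instead of the configured default
hdfs dfs -D dfs.blocksize=268435456 -put localfile.txt /user/hadoop/localfile.txt
# In Hadoop 2.x the suffix form also works, e.g. -D dfs.blocksize=256m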
Upvotes: 1
Reputation: 974
There are two parameters: dfs.block.size (deprecated; the new name is dfs.blocksize) and mapred.max.split.size (deprecated; the new parameter is mapreduce.input.fileinputformat.split.maxsize). When you run a MapReduce program and don't give any value for mapred.max.split.size, it takes the default dfs.block.size, but you can configure the value and thereby control the number of mappers (be cautious of the performance impact, especially when the split size is larger than the block size: your mappers then fetch data over the network, since the data blocks are spread across nodes).
If you actually want to control the map chunk size, it is better to do that for each MapReduce program rather than setting dfs.block.size, as the latter is a global parameter and affects all files stored in HDFS.
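For example, such a per-job override can be passed on the command line. A minimal sketch, assuming the driver class goes through ToolRunner so that generic -D options are parsed (the jar name, driver class, and paths are hypothetical):

# Cap the input split size at 64 MB for this job only
hadoop jar myjob.jar com.example.MyDriver \
  -D mapreduce.input.fileinputformat.split.maxsize=67108864 \
  /input /output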
This link discusses the same topic in great detail: Split size vs Block size in Hadoop
Upvotes: 2
Reputation: 936
In Hadoop 1.x the default block size is 64 MB, and in 2.x it is 128 MB. If you want to change the block size, go to the hdfs-site.xml file and add the following property:
<property>
  <!-- dfs.block.size is the pre-2.x name; dfs.blocksize is the current one -->
  <name>dfs.block.size</name>
  <value>134217728</value>
  <description>Block size</description>
</property>
Note: the size must be given in bytes. For example, 134217728 bytes = 128 MB.
For further queries, see this link (extra):
Change Block size of existing files in Hadoop
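After restarting HDFS with the new configuration, you can check what actually took effect; note that only files written after the change get the new block size, which is what the link above is about. A minimal verification sketch (the file path is hypothetical):

# Print the block size clients resolve from the configuration
hdfs getconf -confKey dfs.blocksize
# Print the block size of a specific file (%o is the block-size format field)
hdfs dfs -stat %o /user/hadoop/newfile.txt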
Upvotes: 4