Saeed Nasehi

Reputation: 1000

How can I set the map chunk size in Hadoop?

I know the default chunk size in Hadoop is 64 MB. I want to change it. How can I do this? Thanks.

Upvotes: 2

Views: 1455

Answers (3)

Ravindra babu

Reputation: 38910

Hadoop 1.x: the default DFS block size is 64 MB.

Hadoop 2.x: the default DFS block size is 128 MB.

Have a look at hdfs-default.xml on the official site:

Name: dfs.blocksize

Value: 134217728

Description: The default block size for new files, in bytes. You can use the following suffix (case insensitive): k(kilo), m(mega), g(giga), t(tera), p(peta), e(exa) to specify the size (such as 128k, 512m, 1g, etc.), or provide the complete size in bytes (such as 134217728 for 128 MB).
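To make the suffix notation concrete, here is a minimal Python sketch of how such a value maps to bytes. It is not Hadoop's own parser; the function name parse_size is made up for illustration, and binary (1024-based) multipliers are assumed, which matches the 134217728 = 128 MB example in the description.

```python
# Illustrative parser for size strings such as "128k", "512m", "1g".
# Not Hadoop's implementation; 1024-based multipliers are an assumption.
SUFFIX_POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5, "e": 6}


def parse_size(text):
    """Return the byte count for a string like '128m' or '134217728'."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIX_POWERS:
        # Strip the suffix and scale by the matching power of 1024.
        return int(text[:-1]) * 1024 ** SUFFIX_POWERS[text[-1]]
    # No suffix: the value is already a plain byte count.
    return int(text)


print(parse_size("128m"))       # 134217728
print(parse_size("134217728"))  # 134217728
print(parse_size("1G"))         # 1073741824
```

So "128m" and "134217728" should describe the same block size.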

Upvotes: 1

Gopi Kolla

Reputation: 974

There are two parameters: dfs.block.size (deprecated; the new name is dfs.blocksize) and mapred.max.split.size (deprecated; the new name is mapreduce.input.fileinputformat.split.maxsize). When you run a MapReduce program without setting mapred.max.split.size, the split size defaults to dfs.block.size. You can set the value yourself to control the number of mappers, but be cautious about the performance impact: when the split size is larger than the block size, your mappers have to read data over the network, because the blocks that make up one split are spread across nodes.

If you actually want to control the map chunk size, it is better to do that per MapReduce program rather than by setting dfs.block.size, since that is a global parameter and affects all files being stored in HDFS.
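As a rough illustration of how the split size relates to the block size and the number of mappers, here is a small Python sketch. It assumes the rule max(minSize, min(maxSize, blockSize)), which is my understanding of what the new-API FileInputFormat does; the function names and the 1 GB file are made up for the example, and the mapper count is only an estimate.

```python
import math

MB = 1024 * 1024


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    # Assumed rule: max(minSize, min(maxSize, blockSize)).
    # With no max split size configured, this falls back to the block size.
    return max(min_size, min(max_size, block_size))


def approx_num_splits(file_size, split_size):
    # Rough estimate only; the framework may let the last split run a bit long.
    return math.ceil(file_size / split_size)


block = 128 * MB        # HDFS block size
file_size = 1024 * MB   # hypothetical 1 GB input file

default_split = compute_split_size(block)
small_split = compute_split_size(block, max_size=64 * MB)

print(default_split // MB, approx_num_splits(file_size, default_split))  # 128 8
print(small_split // MB, approx_num_splits(file_size, small_split))      # 64 16
```

Lowering the max split size for one job doubles the mapper count here without touching the block size of any file in HDFS.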

This link discusses the same topic in detail: Split size vs Block size in Hadoop

Upvotes: 2

Ankur Singh

Reputation: 936

In Hadoop 1.0 the default block size is 64 MB, and in Hadoop 2.0 it is 128 MB. If you want to change the block size, open the hdfs-site.xml file and add the following property:

<property>
    <name>dfs.block.size</name>
    <value>134217728</value>
    <description>Block size</description>
</property>

Note: The size must be given in bytes, not bits. For example: 134217728 bytes = 128 MB.
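A quick check of the arithmetic in Python, assuming the value is read as a byte count (as the hdfs-default.xml description of the block size states) and that 1 MB means 1024 * 1024 bytes:

```python
MB = 1024 * 1024  # bytes per megabyte (binary convention assumed)

value = 134217728

print(value // MB)       # 128 -> read as bytes, the value is 128 MB
print(value // 8 // MB)  # 16  -> read as bits, it would be only 16 MB
print(64 * MB)           # 67108864 -> value to use for a 64 MB block size
```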

For more detail, see this link (extra):

Change Block size of existing files in Hadoop

Upvotes: 4
