yedapoda

Reputation: 850

What is the maximum value for mapreduce.task.io.sort.mb?

When I set mapreduce.task.io.sort.mb = 100000, I get the following exception:

java.lang.Exception: java.io.IOException: Invalid "mapreduce.task.io.sort.mb": 100000

What is the maximum value for mapreduce.task.io.sort.mb?

Upvotes: 7

Views: 9716

Answers (4)

muyexm329

Reputation: 41

In hadoop-2.6.0, org.apache.hadoop.mapred.MapTask.java,

line 427: we can't set mapreduce.task.io.sort.mb to a value exceeding 2047.
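
For reference, the sanity check in MapTask's MapOutputBuffer.init() looks roughly like this (a paraphrased sketch of the Hadoop 2.x source, not a verbatim copy): 0x7FF is 2047, so any value that needs more than 11 bits is rejected.

    // Paraphrased sketch of the sanity check in MapTask.MapOutputBuffer.init() (Hadoop 2.x).
    // 0x7FF == 2047, so any larger value fails the mask test and triggers the
    // "Invalid mapreduce.task.io.sort.mb" IOException shown in the question.
    final int sortmb = job.getInt(MRJobConfig.IO_SORT_MB, 100);  // default 100 MB
    if ((sortmb & 0x7FF) != sortmb) {
      throw new IOException("Invalid \"" + MRJobConfig.IO_SORT_MB + "\": " + sortmb);
    }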

Upvotes: 4

Zeitnot

Reputation: 33

By default, mapreduce.task.io.sort.mb is 100 MB, and it can go up to 2047 MB.
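
So if you need a larger buffer than the default, the most you can ask for is 2047 MB. A minimal, hypothetical job setup (the value 512 is purely illustrative) might look like:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Hypothetical example: raise the sort buffer to 512 MB, well under the 2047 MB cap.
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.task.io.sort.mb", 512);   // any value above 2047 is rejected
    Job job = Job.getInstance(conf, "sort-buffer-example");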

Upvotes: 0

robert towne

Reputation: 191

I realize this question is old, but for those asking the same question, you can check out some of the bugs around this value being capped:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_releasenotes_hdp_2.1/content/ch_relnotes-hdpch_relnotes-hdp-2.1.1-knownissues-mapreduce.html

BUG-12005: Mapreduce.task.io.sort.mb is capped at 2047.

Problem: mapreduce.task.io.sort.mb is hardcoded to not allow values larger than 2047. If you enter a value larger than this, the map tasks will always crash at this line:

https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/java/org/apache/hadoop/mapred/MapTask.java?source=cc#L746

Upvotes: 10

Pradyumna Mohapatra

Reputation: 373

"mapreduce.task.io.sort.mb" is the total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks. So you need to ensure you have 100000 MB memory available on the Cluster nodes.

Upvotes: 0
