Reputation: 235
I am getting the following error for a MapReduce job:
Job initialization failed: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201511121020_1680
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
    at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:828)
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:730)
    at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3775)
    at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:90)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
The input path to this job is: /dir1/dir2///year/mon/day ... (7 days)
Here is what I gathered from research: the error occurs because the split meta info size exceeds the limit set by mapreduce.job.split.metainfo.maxsize. I assume this metadata is written to a file, and that it is the size of that file that has exceeded the limit.
I have a few more questions:
Any help in better understanding this error is appreciated.
Upvotes: 1
Views: 934
Reputation: 6343
By default, the maximum size of the split meta information is 10000000 bytes:
public static final long DEFAULT_SPLIT_METAINFO_MAXSIZE = 10000000L;
You can override it by setting the configuration parameter mapreduce.job.split.metainfo.maxsize in mapred-site.xml.
Now coming to your questions:
One split meta info file is created per job. It is stored in the .staging folder for each job. The name of this file is job.splitmetainfo (the splits themselves are written next to it, in job.split).
The contents of this file are:
1) Split file header: "META-SPL"
2) Split file version: 1
3) Number of splits
4) Information about each split:
a) Locations of the split (a split can be present in 3 locations, if the replication factor is 3),
b) start offset
c) length of the split.
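To get a feel for how the layout listed above grows with the number of splits, here is a rough sketch that writes the same fields to an in-memory buffer and reports the size. It is not Hadoop's actual writer: the class name SplitMetaSketch, the host names, and the fixed-width int/long encoding are all assumptions made for illustration (Hadoop uses its own variable-length encoding, so real files will differ in size).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Hadoop.
public class SplitMetaSketch {
    // Limit quoted in the error message.
    static final long MAX_SIZE = 10000000L;

    /**
     * Writes a simplified meta file for numSplits splits that all share the
     * same hosts and length, and returns the encoded size in bytes.
     * Assumption: fixed-width fields instead of Hadoop's variable-length ones.
     */
    public static long encodedSize(int numSplits, String[] hosts, long splitLength) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.write("META-SPL".getBytes(StandardCharsets.UTF_8)); // 1) header
            out.writeInt(1);                                         // 2) version
            out.writeInt(numSplits);                                 // 3) number of splits
            for (int i = 0; i < numSplits; i++) {                    // 4) per-split info
                out.writeInt(hosts.length);                          // a) location count
                for (String host : hosts) {
                    out.writeUTF(host);                              //    2-byte length + name
                }
                out.writeLong(i * splitLength);                      // b) start offset
                out.writeLong(splitLength);                          // c) length
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Made-up host names; three locations as with replication factor 3.
        String[] hosts = {"node-a.example", "node-b.example", "node-c.example"};
        long size = encodedSize(200000, hosts, 128L * 1024 * 1024);
        System.out.println(size + " bytes, over limit: " + (size > MAX_SIZE));
    }
}
```

Under these assumed values each split adds 68 bytes, so the size is driven almost entirely by the number of splits and the length of the host names. Many small input files spread over several days of directories can therefore push the file past the limit.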
You can find more information about the SplitMetaInfo class in JobSplit.java.
Upvotes: 1