DiegoJr

Reputation: 93

ERROR: Split metadata size exceeded 10000000

I'm getting the error "The job initialization failed: java.io.IOException: Split metadata size exceeded 10000000" when I try to run a job on Hadoop.

Searching online, the suggested fix is to set the mapreduce.job.split.metainfo.maxsize option to -1 in the mapred-site.xml file. But that file can't be edited on the Google cluster I'm using.

I also tried setting the option on the command line using -Dmapreduce.job.split.metainfo.maxsize = -1, but to no avail.

I also tried setting the option in the source code:

Configuration config = new Configuration();
config.set("mapreduce.job.split.metainfo.maxsize", "-1");

But it still fails with the same error. Any alternatives?

Upvotes: 5

Views: 5598

Answers (2)

kevininhe

Reputation: 617

When setting config options, it is important that you implement the Tool interface and set those options in the run method. See this answer for an example: https://stackoverflow.com/a/33365552/3998212
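
For reference, a minimal sketch of that pattern (the class name MyDriver, the job name, and the elided mapper/reducer setup are placeholders, not from the question):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the configuration populated by ToolRunner,
        // including any -D options passed on the command line.
        Configuration conf = getConf();
        conf.set("mapreduce.job.split.metainfo.maxsize", "-1");

        Job job = Job.getInstance(conf, "my job");
        job.setJarByClass(MyDriver.class);
        // ... set mapper, reducer, and input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MyDriver(), args));
    }
}

With this in place, -Dmapreduce.job.split.metainfo.maxsize=-1 on the command line is also parsed by ToolRunner and lands in getConf().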

Upvotes: 0

markers

Reputation: 162

I had the same issue. Two comments:

First, I'm not sure whether -Dmapreduce.job.split.metainfo.maxsize=-1 would work at all, but I don't think the spaces around the = sign are allowed.

Secondly, it looks like you're setting it on a Configuration object that your job isn't actually using. Try setting it on the configuration the job itself uses --

job.getConfiguration().set("mapreduce.job.split.metainfo.maxsize", "-1");
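
In context, that call looks something like the sketch below (a minimal outline; the job name and the elided setup are placeholders):

Job job = Job.getInstance(new Configuration(), "my job");
// Set the option on the job's own configuration, before the job is submitted.
job.getConfiguration().set("mapreduce.job.split.metainfo.maxsize", "-1");
// ... mapper, reducer, and input/output path setup ...
job.waitForCompletion(true);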

That just worked for me. Good Luck.

Upvotes: 5
