Reputation: 29
I work with 8 map tasks and 1 reduce task. Although all of the map task attempts are successfully done, map reduce job failed. My example code is from Hadoop Beginner's Guide (Garry Turkington)that is run for skip data.The main idea of program is that testing of task failure in map reduce. Although data that causing failure (skiptext in example)have in source file, the map reduce can do the job successfully. But, I didn't finish job and encounter the job failed .How should I do?
full source code is:
import java.io.IOException;
import org.apache.hadoop.conf.* ;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.* ;
import org.apache.hadoop.mapred.* ;
import org.apache.hadoop.mapred.lib.* ;
public class SkipData
{
public static class MapClass extends MapReduceBase
implements Mapper<LongWritable, Text, Text, LongWritable>
{
private final static LongWritable one = new
LongWritable(1);
private Text word = new Text("totalcount");
public void map(LongWritable key, Text value,
OutputCollector<Text, LongWritable> output,
Reporter reporter) throws IOException
{
String line = value.toString();
if (line.equals("skiptext"))
throw new RuntimeException("Found skiptext") ;
output.collect(word, one);
}
}
public static void main(String[] args) throws Exception
{
Configuration config = new Configuration() ;
JobConf conf = new JobConf(config, SkipData.class);
conf.setJobName("SkipData");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(LongWritable.class);
conf.setMapperClass(MapClass.class);
conf.setCombinerClass(LongSumReducer.class);
conf.setReducerClass(LongSumReducer.class);
FileInputFormat.setInputPaths(conf,args[0]) ;
FileOutputFormat.setOutputPath(conf, new
Path(args[1])) ;
JobClient.runJob(conf);
}
}
The full error console is:
18/02/28 21:12:58 INFO mapreduce.Job: Job job_local724352166_0001 failed with state FAILED due to: NA
18/02/28 21:12:58 WARN mapred.LocalJobRunner: job_local724352166_0001
java.lang.Exception: java.lang.RuntimeException: Found skiptext
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Found skiptext
at mapredpack.SkipTest$MapClass.map(SkipTest.java:23)
at mapredpack.SkipTest$MapClass.map(SkipTest.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner .java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/02/28 21:12:58 DEBUG security.UserGroupInformation: PrivilegedAction as:naychi (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:758)
18/02/28 21:12:59 DEBUG security.UserGroupInformation: PrivilegedAction as:naychi (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:331)
18/02/28 21:12:59 INFO mapreduce.Job: Counters: 23
File System Counters
FILE: Number of bytes read=29905
FILE: Number of bytes written=2020669
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=128005127
HDFS: Number of bytes written=0
HDFS: Number of read operations=80
HDFS: Number of large read operations=0
HDFS: Number of write operations=7
Map-Reduce Framework
Map input records=1542671
Map output records=1542669
Map output bytes=29310711
Map output materialized bytes=135
Input split bytes=686
Combine input records=1161148
Combine output records=5
Spilled Records=5
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=8601
Total committed heap usage (bytes)=3840933888
File Input Format Counters
Bytes Read=23163911
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
at mapredpack.SkipTest.main(SkipTest.java:58)
18/02/28 21:12:59 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: Stopping client
18/02/28 21:12:59 DEBUG ipc.Client: IPC Client (1313916817) connection to localhost/127.0.0.1:9000 from naychi: closed
18/02/28 21:12:59 DEBUG ipc.Client: IPC Client (1313916817) connection to localhost/127.0.0.1:9000 from naychi: stopped, remaining connections 0
Upvotes: 1
Views: 1767
Reputation: 16390
Looks like the code is working as designed. A skiptext line was found and the job is implemented to throw a task-ending exception in that case. This is a common coding technique to force people to implement logic at a certain point. Put a throw RuntimeException() where the code needs to be modified and the developer is forced to look at that part of the code.
Look at the code and decide what you want to do in the case of a skiptext line. Is there additional logic you need to implement, replacing the exception? If so, then replace the thrown exception with the correct behavior.
Upvotes: 1