Reputation: 23
I am trying to write a MapReduce job in Java. Here are my files.
mapper class(bmapper):
public class bmapper extends Mapper<LongWritable,Text,Text,NullWritable>{
private String txt=new String();
public void mapper(LongWritable key,Text value,Context context)
throws IOException, InterruptedException{
String str =value.toString();
int index1 = str.indexOf("TABLE OF CONTENTS");
int index2 = str.indexOf("</table>");
int index3 = str.indexOf("MANAGEMENT'S DISCUSSION AND ANALYSIS");
if(index1 == -1)
{ txt ="nil";
}
else
{
if(index1<index3 && index2>index3)
{
int index4 = index3+ 109;
int pageno =str.charAt(index4);
String[] pages =str.split("<page>");
txt = pages[pageno+1];
}
else
{
txt ="nil";
}
}
context.write(new Text(txt), NullWritable.get());
}
}
reducer class(breducer):
public class breducer extends Reducer<Text,NullWritable,Text,NullWritable>{
public void reducer(Text key,NullWritable value,Context context) throws IOException,InterruptedException{
context.write(key, value);
}
}
driver class (bdriver):
public class bdriver {
public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
Configuration conf = new Configuration();
Job job = new Job(conf);
job.setJobName("black coffer");
job.setJarByClass(bdriver.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(NullWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
job.setReducerClass(breducer.class);
job.setMapperClass(bmapper.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.setInputPaths(job, new Path[]{new Path(args[0])});
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
}
}
I am getting the following error:
[training@localhost ~]$ hadoop jar blackcoffer.jar com.test.bdriver /page1.txt /MROUT4
18/03/16 04:38:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
18/03/16 04:38:57 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 04:38:57 WARN snappy.LoadSnappy: Snappy native library is available
18/03/16 04:38:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/03/16 04:38:57 INFO snappy.LoadSnappy: Snappy native library loaded
18/03/16 04:38:57 INFO mapred.JobClient: Running job: job_201803151041_0007
18/03/16 04:38:58 INFO mapred.JobClient: map 0% reduce 0%
18/03/16 04:39:03 INFO mapred.JobClient: Task Id : attempt_201803151041_0007_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
I think the job is not picking up my mapper and reducer classes. Even though I set them in the driver's main method, it seems to be running the default Mapper and Reducer implementations.
Upvotes: 3
Views: 446
Reputation: 5557
Your input/output types seem compatible with the job configuration.
Adding the issue details and resolution here (as discussed in the comments, the OP confirmed that this resolved the issue).
As per the Javadoc, the Reducer's reduce method has the following signature:
protected void reduce(KEYIN key,
Iterable<VALUEIN> values,
org.apache.hadoop.mapreduce.Reducer.Context context)
throws IOException,
InterruptedException
Accordingly, the reducer should be:
public class breducer extends Reducer<Text,NullWritable,Text,NullWritable>{
@Override
public void reduce(Text key,Iterable<NullWritable> values,Context context) throws IOException,InterruptedException{
// Your logic
}
}
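The issue was that the methods in your classes (mapper() and reducer(), the latter also taking a single NullWritable instead of an Iterable<NullWritable>) differ in name and signature from map() and reduce(), so they did not override the framework methods. They were just additional methods that Hadoop never calls. The framework therefore ran the default Mapper.map(), which writes the input key and value through unchanged. That is why the stack trace shows Mapper.map() emitting a LongWritable key where Text was expected.
The issue was caught after putting the @Override annotation on the map() and reduce() methods, because the compiler rejects @Override on a method that does not override anything. Although it is not mandatory, as a best practice always add the @Override annotation to your map() and reduce() implementations.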
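The same effect can be reproduced without Hadoop. The following is a minimal sketch in plain Java, not Hadoop code: BaseMapper is a hypothetical stand-in for a framework base class whose default map() passes the key through unchanged, and WrongMapper and RightMapper are illustrative names. It shows that a misnamed method is silently ignored, while a correctly overridden one is called.

```java
import java.util.ArrayList;
import java.util.List;

public class OverrideDemo {

    // Hypothetical stand-in for a framework base class (assumption: not Hadoop's Mapper).
    static class BaseMapper {
        // Default behavior: emit the input key unchanged, like an identity mapper.
        protected void map(String key, String value, List<String> out) {
            out.add(key);
        }

        // The "framework" only ever calls map(); it knows nothing about other methods.
        public final List<String> run(String key, String value) {
            List<String> out = new ArrayList<>();
            map(key, value, out);
            return out;
        }
    }

    // Mistake: the method is named mapper(), so it does not override map().
    // Adding @Override here would turn this into a compile-time error.
    static class WrongMapper extends BaseMapper {
        public void mapper(String key, String value, List<String> out) {
            out.add("custom:" + value);
        }
    }

    // Correct: same name and parameter types, checked by the compiler via @Override.
    static class RightMapper extends BaseMapper {
        @Override
        protected void map(String key, String value, List<String> out) {
            out.add("custom:" + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(new WrongMapper().run("0", "line")); // prints [0]
        System.out.println(new RightMapper().run("0", "line")); // prints [custom:line]
    }
}
```

WrongMapper emits the untouched input key, which is the plain-Java analogue of the LongWritable key reaching the output collector in the question.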
Upvotes: 1