khush

Reputation: 23

Error: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

I am trying to write a MapReduce job in Java. Here are my files.

Mapper class (bmapper):

public class bmapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private String txt = new String();

    public void mapper(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String str = value.toString();
        int index1 = str.indexOf("TABLE OF CONTENTS");
        int index2 = str.indexOf("</table>");
        int index3 = str.indexOf("MANAGEMENT'S DISCUSSION AND ANALYSIS");

        if (index1 == -1) {
            txt = "nil";
        } else {
            if (index1 < index3 && index2 > index3) {
                int index4 = index3 + 109;
                int pageno = str.charAt(index4);
                String[] pages = str.split("<page>");
                txt = pages[pageno + 1];
            } else {
                txt = "nil";
            }
        }

        context.write(new Text(txt), NullWritable.get());
    }
}

Reducer class (breducer):

public class breducer extends Reducer<Text, NullWritable, Text, NullWritable> {

    public void reducer(Text key, NullWritable value, Context context)
            throws IOException, InterruptedException {

        context.write(key, value);
    }
}

Driver class (bdriver):

public class bdriver {

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJobName("black coffer");
        job.setJarByClass(bdriver.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setReducerClass(breducer.class);
        job.setMapperClass(bmapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path[]{new Path(args[0])});
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}


I am getting the following error.

[training@localhost ~]$ hadoop jar blackcoffer.jar com.test.bdriver /page1.txt /MROUT4
18/03/16 04:38:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
18/03/16 04:38:57 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 04:38:57 WARN snappy.LoadSnappy: Snappy native library is available
18/03/16 04:38:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/03/16 04:38:57 INFO snappy.LoadSnappy: Snappy native library loaded
18/03/16 04:38:57 INFO mapred.JobClient: Running job: job_201803151041_0007
18/03/16 04:38:58 INFO mapred.JobClient:  map 0% reduce 0%
18/03/16 04:39:03 INFO mapred.JobClient: Task Id : attempt_201803151041_0007_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)

I think it is not able to find my Mapper and Reducer classes. I have set them in the driver's main() method, but the job seems to be using the default Mapper and Reducer classes.

Upvotes: 3

Views: 446

Answers (1)

Gyanendra Dwivedi

Reputation: 5557

Your input/output types seem compatible with the job configuration.

Adding the issue detail and resolution here (as per the discussion in the comments, the OP confirmed that this resolved the issue).

As per the Javadoc, the reducer's reduce() method has the following signature:

protected void reduce(KEYIN key,
                      Iterable<VALUEIN> values,
                      org.apache.hadoop.mapreduce.Reducer.Context context)
        throws IOException, InterruptedException

Accordingly, the reducer should be:

public class breducer extends Reducer<Text, NullWritable, Text, NullWritable> {
    @Override
    public void reduce(Text key, Iterable<NullWritable> values, Context context)
            throws IOException, InterruptedException {
        // Your logic
    }
}

The issue was that, because of the slight difference in the names and signatures of the map() and reduce() methods, they were not actually overriding Mapper.map() and Reducer.reduce(); they were just extra methods that the framework never calls. Hadoop therefore fell back to the default identity Mapper, which passes the LongWritable input key straight through to the output, producing the type mismatch in the error above.
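
For the same reason, the mapper must override map() rather than defining a new mapper() method. Here is a minimal sketch of the corrected mapper, keeping the question's types; the page-extraction logic is the OP's and is only abbreviated here:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class bmapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    // @Override makes the compiler verify this really overrides Mapper.map()
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String str = value.toString();
        // ... the OP's page-extraction logic goes here, unchanged ...
        String txt = "nil";  // placeholder; set by the extraction logic above

        // The emitted key is now Text, matching job.setMapOutputKeyClass(Text.class)
        context.write(new Text(txt), NullWritable.get());
    }
}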

The issue was caught after putting the @Override annotation on the map() and reduce() methods. Although it is not mandatory, as a best practice always add @Override to your implemented map() and reduce() methods.
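
For illustration, with @Override in place the misnamed method no longer compiles, so the bug surfaces at build time instead of silently producing wrong output (snippet as it would appear inside the original bmapper class):

@Override  // compile error: method does not override or implement a method from a supertype
public void mapper(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // ...
}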

Upvotes: 1
