devj

Reputation: 1153

Hadoop ClassCastException for default value of InputFormat

I'm having an issue getting started with my first map-reduce code on Hadoop. I copied the following code from "Hadoop: The Definitive Guide", but I'm not able to run it on my single-node Hadoop installation.

My Code snippet:

Main:

Job job = new Job(); 
job.setJarByClass(MaxTemperature.class);
job.setJobName("Max temperature");

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

System.exit(job.waitForCompletion(true) ? 0 : 1);

Mapper:

public void map(LongWritable key, Text value, Context context)

Reducer:

public void reduce(Text key, Iterable<IntWritable> values,
Context context)

Implementations of map and reduce function are also picked from the book only. But when I try to execute this code, this is the error I get:

INFO mapred.JobClient: Task Id : attempt_201304021022_0016_m_000000_0, Status : FAILED
    java.lang.ClassCastException: interface javax.xml.soap.Text
    at java.lang.Class.asSubclass(Class.java:3027)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:774)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Answers to similar questions in the past (Hadoop type mismatch in key from map expected value Text received value LongWritable) helped me figure out that the InputFormat class should match the input to the map function. So I also tried calling job.setInputFormatClass(TextInputFormat.class); in my main method, but that did not solve the issue either. What could be the issue here?

Here is the implementation of the Mapper class

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final int MISSING = 9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {

    String line = value.toString();
    String year = line.substring(15, 19);

    int airTemperature;
    if (line.charAt(45) == '+') { // parseInt doesn't like leading plus signs
      airTemperature = Integer.parseInt(line.substring(46, 50));
    } else {
      airTemperature = Integer.parseInt(line.substring(45, 50));
    }
    String quality = line.substring(50, 51);
    if (airTemperature != MISSING && quality.matches("[01459]")) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }
}

Upvotes: 1

Views: 1757

Answers (2)

USB

Reputation: 6139

You auto-imported the wrong class: instead of org.apache.hadoop.io.Text you imported javax.xml.soap.Text.

You can find a sample wrong import in this blog.

Upvotes: 3

Chris Gerken

Reputation: 16392

Looks like you have the wrong Text class imported (javax.xml.soap.Text). You want org.apache.hadoop.io.Text.
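The stack trace shows where this blows up: JobConf.getOutputKeyComparator calls Class.asSubclass to check that the configured output key class implements Hadoop's WritableComparable interface, and javax.xml.soap.Text does not, so asSubclass throws ClassCastException. Here is a minimal standalone sketch of that mechanism; the names (AsSubclassDemo, and the Writable/HadoopText/SoapText stand-ins) are hypothetical placeholders, not Hadoop classes, so the demo runs without any Hadoop dependency:

```java
public class AsSubclassDemo {
    // Stand-in for Hadoop's WritableComparable interface.
    interface Writable {}
    // Stand-in for org.apache.hadoop.io.Text, which implements the interface.
    static class HadoopText implements Writable {}
    // Stand-in for javax.xml.soap.Text, which is unrelated to it.
    static class SoapText {}

    public static void main(String[] args) {
        // The right class passes the runtime subtype check.
        Class<? extends Writable> ok = HadoopText.class.asSubclass(Writable.class);
        System.out.println("ok: " + ok.getSimpleName()); // prints "ok: HadoopText"

        // The wrong class fails it, just like in the posted stack trace.
        try {
            SoapText.class.asSubclass(Writable.class);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

This is why the error only surfaces when the job runs: the import compiles fine, and the mismatch is caught by a reflection check at task startup.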

Upvotes: 2
