rlperez

Reputation: 1328

Found interface org.apache.hadoop.mapreduce.TaskAttemptContext

I haven't found a solution to my particular problem so far; nothing I've tried works, and it's driving me pretty crazy. This particular combination doesn't seem to turn up much on Google. From what I can tell, the error occurs as the job goes into the mapper. The input to this job is Avro-schema'd output compressed with deflate, though I tried uncompressed as well.

Avro: 1.7.7 Hadoop: 2.4.1

I am getting this error and I'm not sure why. Below are my job, mapper, and reducer. The error happens when the mapper runs.

Sample uncompressed Avro input (a record matching StockReport.SCHEMA$):

{"day": 3, "month": 2, "year": 1986, "stocks": [{"symbol": "AAME", "timestamp": 507833213000, "dividend": 10.59}]}

Job

@Override
public int run(String[] strings) throws Exception {
    Job job = Job.getInstance();
    job.setJobName("GenerateGraphsJob");
    job.setJarByClass(GenerateGraphsJob.class);

    configureJob(job);

    int resultCode = job.waitForCompletion(true) ? 0 : 1;

    return resultCode;
}

private void configureJob(Job job) throws IOException {
    try {
        Configuration config = getConf();
        Path inputPath = ConfigHelper.getChartInputPath(config);
        Path outputPath = ConfigHelper.getChartOutputPath(config);

        job.setInputFormatClass(AvroKeyInputFormat.class);
        AvroKeyInputFormat.addInputPath(job, inputPath);
        AvroJob.setInputKeySchema(job, StockReport.SCHEMA$);


        job.setMapperClass(StockAverageMapper.class);
        job.setCombinerClass(StockAverageCombiner.class);
        job.setReducerClass(StockAverageReducer.class);

        FileOutputFormat.setOutputPath(job, outputPath);

    } catch (IOException | ClassCastException e) {
        LOG.error("An job error has occurred.", e);
    }
}

Mapper:

public class StockAverageMapper extends
        Mapper<AvroKey<StockReport>, NullWritable, StockYearSymbolKey, StockReport> {

    private static Logger LOG = LoggerFactory.getLogger(StockAverageMapper.class);

    private final StockReport stockReport = new StockReport();
    private final StockYearSymbolKey stockKey = new StockYearSymbolKey();

    @Override
    protected void map(AvroKey<StockReport> inKey, NullWritable ignore, Context context)
            throws IOException, InterruptedException {
        try {
            StockReport inKeyDatum = inKey.datum();
            for (Stock stock : inKeyDatum.getStocks()) {
                updateKey(inKeyDatum, stock);
                updateValue(inKeyDatum, stock);
                context.write(stockKey, stockReport);
            }
        } catch (Exception ex) {
            LOG.debug(ex.toString());
        }
    }

Schema for map output key:

{
  "namespace": "avro.model",
  "type": "record",
  "name": "StockYearSymbolKey",
  "fields": [
    {
      "name": "year",
      "type": "int"
    },
    {
      "name": "symbol",
      "type": "string"
    }
  ]
}
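(Aside: since both map outputs are Avro records, I assume they'd normally also be declared on the job through the AvroJob helpers — a sketch, and probably unrelated to the error:)

// Sketch: declaring the Avro map output schemas; with these set, the mapper
// outputs are typically wrapped as AvroKey<StockYearSymbolKey> / AvroValue<StockReport>.
AvroJob.setMapOutputKeySchema(job, StockYearSymbolKey.SCHEMA$);
AvroJob.setMapOutputValueSchema(job, StockReport.SCHEMA$);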

Stack trace:

java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
    at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:492)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:735)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Edit: Not that it matters, but the goal is to reduce this to data I can create JFreeChart output from. Since the job doesn't get past the mapper, that shouldn't be related.

Upvotes: 7

Views: 6656

Answers (2)

Vladimir Kroz

Reputation: 5367

The problem is that Avro 1.7.7 supports two versions of Hadoop and hence publishes artifacts for both; by default, the Avro 1.7.7 jars depend on the old Hadoop version. To build with Avro 1.7.7 against Hadoop 2, just add an extra classifier line to your Maven dependency:

    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-mapred</artifactId>
        <version>1.7.7</version>
        <classifier>hadoop2</classifier>
    </dependency>

This tells Maven to look for avro-mapred-1.7.7-hadoop2.jar instead of avro-mapred-1.7.7.jar.

The same applies to Avro 1.7.4 and above.
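If in doubt, you can check which artifact Maven actually resolved with the dependency plugin; the classifier shows up in the printed coordinates (e.g. org.apache.avro:avro-mapred:jar:hadoop2:1.7.7:compile):

mvn dependency:tree -Dincludes=org.apache.avro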

Upvotes: 2

Dennis Huo

Reputation: 10677

The problem is that org.apache.hadoop.mapreduce.TaskAttemptContext was a class in Hadoop 1 but became an interface in Hadoop 2.

This is one of the reasons libraries that depend on the Hadoop libs need separately compiled jarfiles for Hadoop 1 and Hadoop 2. Based on your stack trace, it appears you somehow got a Hadoop 1-compiled Avro jarfile despite running with Hadoop 2.4.1.

The Avro download mirrors provide separate downloads for avro-mapred-1.7.7-hadoop1.jar and avro-mapred-1.7.7-hadoop2.jar.
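If you want to confirm which flavor ended up on your runtime classpath, a quick reflective check works (a sketch; run it with the same classpath your job uses — the class name here is made up):

public class TaskAttemptContextCheck {
    public static void main(String[] args) throws ClassNotFoundException {
        // Interface => Hadoop 2 binaries; class => Hadoop 1 binaries.
        Class<?> tac = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext");
        System.out.println(tac.getName()
                + (tac.isInterface() ? " is an interface (Hadoop 2)" : " is a class (Hadoop 1)"));
        // Show which jar the class was loaded from, to spot a stray Hadoop 1 artifact.
        java.security.CodeSource src = tac.getProtectionDomain().getCodeSource();
        System.out.println("Loaded from: " + (src != null ? src.getLocation() : "bootstrap classpath"));
    }
}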

Upvotes: 9
